PREFACE


Under  Public  Law  89-306  (Brooks  Act),  the  National 
Bureau  of  Standards  (Institute  for  Computer  Sciences  and 
Technology)  has  responsibilities  to  develop  Federal  Informa- 
tion Processing  Standards  (FIPS).  Three  major  goals  are 
sought  in  Federal  standards:  improved  competition  among 
vendors  providing  computer  systems  or  services  to  the 
government,  improved  procurement  procedures,  and  improved 
interchange  of  data  and  programs  within  the  Federal  govern- 
ment. FIPS  publications  may  be  either  standards  or  guide- 
lines. A  standard  is  a  precise  statement  of  required  func- 
tions or  actions,  while  a  guideline  advises  and  suggests  ac- 
tions. Examples  of  standards  range  from  the  data  codes  for 
state  abbreviations  to  the  complex  COBOL  language  specifica- 
tions. Examples  of  guidelines  include  the  published  recom- 
mendations for  physical  computer  security  and  privacy  pro- 
tection. 


To assist the National Bureau of Standards in its consideration
of FIPS standards, Task Groups are sometimes established to
address specific subject areas. These Task Groups are advisory
bodies made up of volunteer participants from Federal agencies.
Task Group 24 on Data Base Management Systems is such a group.
TG-24's purpose, scope, and program of work are contained in
its charter, Appendix 1 of this report.

The  issues  addressed  by  Task  Group  24  are  important, 
complex,  and  highly  technical.  The  recommendations  of  the 
Task  Group  are  valued  contributions  to  our  work  as  represen- 
tative statements  of  requirements.  They  are  not  necessarily 
the  technical  judgments  or  current  positions  of  NBS.  These 
views  provide  a  concrete  reference  point  for  others  to  add 
their  comments  and  recommendations.  In  each  area  addressed 
by  the  Task  Group,  NBS  has  underway  a  thorough  study,  in- 
cluding a  cost-benefit  analysis,  leading  towards  a  proposed 
Federal  standard  if  warranted  by  the  conclusions  of  continu- 
ing study.  Our  analyses  coupled  with  continuing  input  from 
Federal  agencies  will  guide  the  final  decisions  on  Federal 
data  base  standardization.  Consequently,  we  publish  this 
report  to  invite  additional  comment.  We  will  continue  to 
seek  comment  as  we  proceed  through  the  various  steps  toward 
standardization.


S.  Jeffery,  Director 
Center for Programming
Science  and  Technology 
EXECUTIVE SUMMARY


The  need  for  standardization  in  database  management 
results  from  the  increased  usage  of  database  software  and 
the  increased  demand  for  data  interchange  within  the  Federal 
government. To meet this need, Task Group 24 recommends the
following  actions: 


++Terminology.


1.  A  standard  set  of  data  base  oriented  terms  should  be 
established,  should  be  coordinated  with  the  work  of 
ISO,  and  should  be  included  in  the  current  ANSI 
standard  data  processing  glossary. 

2.  Guidelines  should  be  established  which  encourage 
DBMS  developers  to  use  this  glossary  when  describing 
database  concepts. 

3.  Guidelines  should  be  established  whereby  vendors  are 
encouraged  to  utilize  DBMS  language  syntax  which  is 
compatible  with  this  glossary. 


++Data Description.


1.  A  two-part  data  description  standard  is  required 
that  contains  the  common  description  of  the  data 
element  (attributes)  and  the  facility  to  describe 
multiple  data  structure  classes. 

2.  The  specification  of  the  standard  description  of 
data  attributes  should  be  similar  to  the  attributes 
in  the  PICTURE  and  TYPE  clauses  of  the  CODASYL  DDL. 


3.  The  specification  of  the  standard  description  of 
data  structures  should  be  required  to  encompass 
current  data  models  such  as  the  hierarchical,  net- 
work, and  relational. 

4.  Consistent  with  recommendation  1,  the  standard 
description  of  the  network  data  structure  within  the 
DDL  should  be  based  on  the  CODASYL  DDL. 

5.  Consistent  with  recommendation  1,  companion  data 
structure  descriptions  for  the  hierarchical  and  re- 
lational data  models  should  be  developed. 
++Data Manipulation.


1.  Develop multiple Data Manipulation Language standards
specifications.

2.  [Develop a data manipulation language standard
specification:]

(a)  For  each  standard  host  programming  language 
(e.g.,  COBOL,  FORTRAN),  develop  immediately  a  stan- 
dard DML  specification  for  each  category  identified 
by  TG-24. 

(b)  As  a  short  range  goal,  develop  a  single  standard 
DML  specification  for  a  given  category  that  inter- 
faces with  all  standard  host  programming  languages. 

(c)  As  a  long  range  goal,  develop  a  single  standard 
DML  specification  containing  the  functionality  of 
all  categories. 


++Data Dictionary.


1.  Data  dictionaries  used  by  Federal  agencies  must  be 
able  to  produce  the  standard  DDL  attribute  descrip- 
tion as  recommended  in  Section  3.3.2  by  the  DDL  Sub- 
committee of  TG-24.  (TG-24  took  no  position  on  the 
standardization  of  data  dictionaries  but  addressed 
only  those  data  dictionaries  with  an  interface 
between  the  data  dictionaries  and  the  DBMS  which 
must  be  standardized.) 

2.  The design of a data dictionary must be capable of
combining  standard  data  attribute  descriptions  with 
data  structure  descriptions  to  generate  the  DDL  for 
one  or  more  DBMS. 

3.  Establish  guidelines  on  the  data  dictionary  usage. 


++End-user/query Facilities.


1.  Standardization of syntax and semantics of end-user
facilities is not required. Such facilities are
easily learned, problem and subject-matter dependent,
and there is a diversity of end-user facility
"styles."

2.  Guidelines should be developed to aid the specification
of requirements of end-user facilities for
Federal procurement purposes.


++Standards Adoption.


1.  The  standards  recommended  should  not  be  mandatory 
until  other  factors  determine  a  change  to  the 
relevant  systems. 

2.  Standards can be required for new applications or
systems without requiring existing systems to
also conform to the standard.

++Other  Major  Areas. 

1.  TG-24  recommends  that  the  database  standards  actions 
described  above  apply  to  database  facilities  pro- 
vided by  computer  services,  distributed  systems,  and 
minicomputers.
RECOMMENDATIONS FOR
DATABASE  MANAGEMENT  SYSTEM  STANDARDS 

The Federal Information Processing Standards

Task  Group  on 
Database  Management  System  Standards 
(FIPS  TG-24) 


In March 1977, FIPS Task Group 24 initiated a
study of the need for database standards within
the Federal government. The voluntary participants
from several Federal agencies considered the
actions of other standards bodies; reviewed the
alternatives to Federal standards; examined the
issues of standards adoption, timing, and impact
on technology; developed a method for justifying
standards; and attempted to anticipate likely
database technology advancements.

TG-24 recommended standards in certain
specific technical areas, concluded that standards
were premature in others, and emphasized the need
for certain guidelines.

This  final  report  of  TG-24  contains  the 
recommendations  for  standards  and  guidelines  as 
well  as  the  assumptions,  benefits,  and  costs  con- 
siderations used  to  justify  the  recommendations. 

Key words: Database; DBMS; data-description;
data-dictionary; data-directory; data-manipulation;
languages; query; standards.


1.  INTRODUCTION 


1.1     BRIEF  STATEMENT  OF  THE  PROBLEM 

Federal  government  usage  of  database  technology,  like 
the  rapid  growth  noted  in  private  industry,  increases  year- 
ly. Each  database  system  differs  from  the  others  and  inhi- 
bits the  interchange  of  skilled  personnel,  programs  or  data 
among  these  different  systems.  Even  similar  systems  have 
slight  differences  that  prevent  quick  interchange. 
The resultant growth in the number of diverse database
systems undermines Federal goals sought by the standardization
of programming languages and other software components.
In  view  of  the  predicted  hardware  conversions  within  the 
Federal  government,  and  the  even  greater  likelihood  of 
operating  system  and  peripheral  device  changes  over  the  next 
ten  years,  database  technology  will  not  meet  its  potential 
of  facilitating  conversion  and  may  even  worsen  the  conver- 
sion problem. 

Current  standards  work  is  being  performed  in  several 
areas  of  database  technology  but  many  other  areas  are  being 
overlooked. While voluntary DBMS standards actions in the
American National Standards Institute (ANSI) are underway,
these actions require close, cooperative monitoring to ensure
that they will meet Federal needs and time frames.


1.2     BRIEF  STATEMENT  OF  RECOMMENDED  SOLUTIONS 

This  report  contains  a  specific  recommendation  to 
develop  a  family  of  database  standards  for  the  Federal 
government.  The  Task  Group  recommends  the  following  ac- 
tions: 

++Terminology.

1.  A  standard  set  of  data  base  oriented  terms  should  be 
established,  should  be  coordinated  with  the  work  of 
ISO,  and  should  be  included  in  the  current  ANSI 
standard  data  processing  glossary. 

2.  Guidelines  should  be  established  which  encourage 
DBMS  developers  to  use  this  glossary  when  describing 
database  concepts. 

3.  Guidelines  should  be  established  whereby  vendors  are 
encouraged  to  utilize  DBMS  language  syntax  which  is 
compatible  with  this  glossary. 

++Data Description.

1.  A  two-part  data  description  standard  is  required 
that  contains  the  common  description  of  the  data 
element  (attributes)  and  the  facility  to  describe 
multiple  data  structure  classes. 

2.  The specification of the standard description of
data  attributes  should  be  similar  to  the  attributes 
in  the  PICTURE  and  TYPE  clauses  of  the  CODASYL  DDL. 

3.  The specification of the standard description of
data  structures  should  be  required  to  encompass 
current  data  models  such  as  the  hierarchical,  net- 
work, and  relational. 

4.  Consistent  with  recommendation  1,  the  standard 
description  of  the  network  data  structure  within  the 
DDL  should  be  based  on  the  CODASYL  DDL. 

5.  Consistent  with  recommendation  1,  companion  data 
structure  descriptions  for  the  hierarchical  and  re- 
lational  data  models  should  be  developed. 

++Data  Manipulation. 

1.  Develop  multiple  [Data  Manipulation  Language]  stan- 
dards specifications. 

2.  [Develop  a  data  manipulation  language  standard 
specification:] 

(a)  For  each  standard  host  programming  language 
(e.g.,  COBOL,  FORTRAN),  develop  immediately  a  stan- 
dard DML  specification  for  each  category  identified 
by  TG-24. 

(b)  As  a  short  range  goal,  develop  a  single  standard 
DML  specification  for  a  given  category  that  inter- 
faces with  all  standard  host  programming  languages. 

(c)  As  a  long  range  goal,  develop  a  single  standard 
DML  specification  containing  the  functionality  of 
all categories.

++Data  Dictionary. 

1.  Data  dictionaries  used  by  Federal  agencies  must  be 
able  to  produce  the  standard  DDL  attribute  descrip- 
tion as  recommended  in  Section  3.3.2  by  the  DDL  Sub- 
committee of  TG-24.  (TG-24  took  no  position  on  the 
standardization  of  data  dictionaries  but  addressed 
only  those  data  dictionaries  with  an  interface 
between  the  data  dictionaries  and  the  DBMS  which 
must  be  standardized.) 

2.  The  design  of  data  dictionary  must  be  capable  of 
combining  standard  data  attribute  descriptions  with 
data  structure  descriptions  to  generate  the  DDL  for 
one  or  more  DBMS. 

3.  Establish guidelines on the data dictionary usage.

++End-user/query Facilities.

1.  Standardization  of  syntax  and  semantics  of  end-user 
facilities  is  not  required.  Such  facilities  are 
easily learned, problem and subject-matter dependent,
and there is a diversity of end-user facility
"styles."

2.  Guidelines  should  be  developed  to  aid  the  specifica- 
tion of  requirements  of  end-user  facilities  for 
Federal   procurement  purposes. 

++Standards  Adoption. 

1.  The standards recommended should not be mandatory
until other factors determine a change to the
relevant systems.

2.  Standards  can  be  required  for  new  applications  or 
systems  without  requiring  existing  systems  to  also 
conform  to  the  standard. 

++Other Major Areas.

1.  TG-24  recommends  that  the  database  standards  actions 
described  above  apply  to  data  base  facilities  pro- 
vided by  computer  services,  distributed  systems,  and 
minicomputers. 
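
The two-part data description (Data Description recommendation 1) and the data dictionary's role in generating DDL for more than one DBMS (Data Dictionary recommendation 2) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names, the function names, the sample records, and especially the output text, which is a made-up notation and not CODASYL DDL or any vendor's syntax. The only point shown is that one set of attribute descriptions can be combined with different data structure descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """Part 1: structure-independent description of one data element (hypothetical)."""
    name: str
    picture: str  # PICTURE-style format string, e.g. "9(5)" or "X(30)"


@dataclass(frozen=True)
class SetType:
    """Part 2, network model: an owner-member relationship between two records."""
    name: str
    owner: str
    member: str


def network_ddl(records, sets):
    """Combine attribute descriptions with a network structure description."""
    lines = []
    for record, attributes in records.items():
        lines.append(f"RECORD {record}")
        lines.extend(f"  ITEM {a.name} PICTURE {a.picture}" for a in attributes)
    lines.extend(f"SET {s.name} OWNER {s.owner} MEMBER {s.member}" for s in sets)
    return "\n".join(lines)


def relational_ddl(records):
    """Reuse the same attribute descriptions with a relational structure."""
    lines = []
    for record, attributes in records.items():
        columns = ", ".join(f"{a.name} {a.picture}" for a in attributes)
        lines.append(f"RELATION {record} ({columns})")
    return "\n".join(lines)


# Illustrative dictionary content; the record and item names are invented.
records = {
    "AGENCY": [Attribute("AGENCY-CODE", "X(4)"), Attribute("AGENCY-NAME", "X(30)")],
    "SYSTEM": [Attribute("SYSTEM-ID", "9(5)")],
}
sets = [SetType("AGENCY-SYSTEMS", owner="AGENCY", member="SYSTEM")]

print(network_ddl(records, sets))
print(relational_ddl(records))
```

Keeping the attribute descriptions separate from the structure descriptions is what lets the same dictionary entries feed both generators; a hierarchical generator could be added in the same way.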


1.3     PURPOSE  AND  METHOD  OF  FIPS  TG-24 

The  work  of  FIPS  TG-24  followed  its  charter  which  ap- 
pears in  Appendix  1  of  this  report.  This  charter  contains 
FIPS  TG-24's  purpose,  scope  and  program  of  work.  Nineteen 
Federal  agencies  contributed  to  this  report  over  the  course 
of  one  year.  Several  sub-Task  Groups  addressed  specific  sub- 
jects and  reported  their  findings  to  the  committee. 
Viewpoints from experts in the field, Federal government
agency  managers  facing  database  decisions,  and  published  ma- 
terial on  database  technology  assisted  the  Task  Group  in 
meeting  its  goals.  Subcommittees  were  formed  to  investigate 
and  propose  standards  for  each  component.  The  final  overall 
recommendations  were  reviewed  so  that  each  component's 
recommendations  are  consistent. 

In  addition  to  describing  specific  functional  capabili- 
ties, each  subcommittee  considered  and  addressed  issues  such 
as:

1.  A single standard or multiple standards.

2.  To  standardize  at  the  syntax  level  or  functional 
level. 

3.  The  nature  of  the  interface  between  the  components 
and  the  DBMS  as  a  whole  software  system,  and  the  in- 
terface between  components. 

4.  The  impact  of  the  proposed  standards  on  the  future 
DBMS  technology. 

5.  The  cost  and  benefit  of  the  proposed  standards 
within  each  component. 

6.  An  assessment  of  the  cost  of  implementing  and  main- 
taining the  proposed  standards. 

7.  A  plan  for  the  achievement  of  the  proposed  stan- 
dards. 

These  issues  are  individually  addressed  in  the  "Justifica- 
tion" subsections  within  the  Chapter  "Recommendations  for 
Standards." 


1.4    ACTIONS  OF  OTHER  STANDARDS  BODIES 


1.4.1  American National Standards Institute. In autumn
1972, the American National Standards Institute (ANSI)
Committee on Computers and Information Processing (X3),
through its Standards Planning and Requirements Committee
(SPARC), established a Study Group on Data Base Management
Systems with a charter "to investigate the potential
for standards." The Study Group issued an interim report in
1975 and a final report in July 1977. [ANSI 75, TSIC 77]

The  final  report  contained  neither  specifications  for  a 
recommended  standard  nor  recommendations  for  any  action  for 
standardization  of  any  existing  products  or  specifications. 
The  report  does  contain  a  "framework"  which  can  be  used  to 
consider  future  standards  actions. 

After accepting the report, SPARC initiated three pertinent
actions: it referred actions for subschema data description
language specifications and data manipulation language
specifications to the COBOL committee, referred actions for
a subschema data description language specification and data
manipulation language specification to the FORTRAN committee,
and initiated a committee for data description language
specifications.




1.4.2  International Standards Organization. Within the
International  Standards  Organization  (ISO),  Technical  Com- 
mittee 97/Study  Committee  5/Working  Group  3  on  DBMS  meets 
semiannually  and  has  a  very  ambitious  program  of  work.  Its 
Scope  of  Work  includes: 

1.  Define concepts for conceptual schema languages

2.  Define or monitor definition of conceptual schema
language

3.  Develop a methodology for assessing proposals for
conceptual schema languages

4.  Assess candidate proposals for conceptual schema
languages

5.  Define concepts for conceptual level end user facilities

6.  Define conceptual level end user facilities

7.  Take cognizance of and react to other data base
developments as appropriate

8.  Develop vocabulary for Data Base Management Systems

1.4.3  Specification  Work  of  Other  Bodies.  The  Conference  on 
Data Systems Languages (CODASYL) is a voluntary body that
developed  the  Common  Business  Oriented  Language  (COBOL)  and 
guided  that  language's  evolutionary  development.  CODASYL 
developed  detailed  language  specifications  for  a  FORTRAN 
Data  Manipulation  Language  and  Subschema  Data  Description 
Language,  a  COBOL  Data  Manipulation  Language  and  Subschema 
Data  Description  Language,  a  host-language  independent  Data 
Description  Language,  and  a  draft  Data  Storage  Description 
Language. The FORTRAN specifications were published in January
1977 and the latter three will be published in April
1978.

Although CODASYL is not a standards body, the impact of its
work on languages is well known, and its work in the DBMS area
has already had significant impact. The various specification
developing committees of CODASYL meet at intervals
varying from every six weeks to three or four months.
Approximately 25 different vendors and users are represented
in the developmental committees. Proposals for improvements
to the specifications arrive from world-wide sources,
including the European Computer Manufacturers Association, the
International Federation of Information Processing Societies,
and several vendor user bodies. CODASYL specifications
will continue to evolve and will be considered as
standards candidates by ANSI. However, CODASYL is not a
standards making body and CODASYL specifications cannot
become standards through CODASYL actions alone.

1.4.4  Individual Federal Agency Standards Bodies. Among the
larger Federal agencies participating in TG-24, it was noted
that several had their own standards-making function and
some were currently considering DBMS standards. For example,
the Department of Agriculture has already established a
policy to use exclusively one proprietary DBMS for most
applications. Similarly, the Army reviewed various options
related to reducing the number of DBMS used by its components.


2.     FEDERAL  DBMS  STANDARDS  QUESTIONS 


2.1     WHY  DBMS  STANDARDS? 


2.1.1  Survey of DBMS Usage By TG-24 Participants. TG-24 surveyed
DBMS usage of those Federal agencies participating in
TG-24  in  order  to  understand  to  some  degree  the  actual  usage 
of  DBMS  within  the  Federal  agencies.  The  survey  was  not  a 
"rigorous"  one  but  it  is  useful  in  indicating  the  likelihood 
of  significant  trends  in  the  Federal  agencies.  Details  of 
the  survey  and  its  findings  are  in  Appendix  2. 

The  informal  results  support  the  assumption  that 
Federal  agency  usage  of  database  systems  parallels  the 
growth  of  such  systems  in  private  industry.  Within  the  14 
agencies  surveyed,  57  distinct  DBMS  were  found.  A  few  years 
ago,  only  large,  special  purpose  database  systems  were  re- 
ported and  these  were  primarily  in  the  Department  of  De- 
fense. Therefore,  TG-24  inferred  a  significant  growth  in 
Federal  DBMS  usage  in  recent  years.  Detailed  determination 
of  the  quantities  of  program  code  and  data  now  committed  to 
DBMS  systems  awaits  a  more  comprehensive  and  carefully 
developed  survey. 

Contributing  to  the  difficulty  of  finding  information 
on  Federal  use  of  DBMS  is  the  lack  of  a  central  repository 
of  such  information.  TG-24  lacked  the  resources  to  conduct 
a  formal  user  survey.  Even  when  data  may  exist  it  is  not  in 
a  form  that  permits  ready  synthesis  of  the  information  need- 
ed. TG-24  recognized  the  value  such  information  would  have 
for  Federal  planning  and  encouraged  the  development  of  such 
statistics but, at the same time, understood the expense
which may be involved in determining them.


2.1.2  Government DBMS Usage Trends. In a copyrighted survey
of DBMS usage at 360/370 sites [IDC 77], International Data
Corporation  found  that  "of  861  sites  in  sample,  312  users 
(36.2%)  reported  usage  of  a  DBMS  at  yearend  1976,  and  the 
figure  climbed  to  433  sites  [planned]  by  yearend  1978 
(50.3%)."  If  the  same  percentages  hold  true  for  the  Federal 
Government's  8649  computers  (1975),  3131  sites  now  use  DBMS 
and  4350  will   in  1978. 
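
The extrapolation above can be checked with a short calculation. This is only a sketch of the arithmetic: the variable and function names are illustrative, and it carries over the text's own assumption that the IDC percentages for 360/370 sites apply equally to the Federal inventory of 8649 computers, with each computer counted as one site.

```python
# Figures quoted in the text: the IDC sample and the 1975 Federal inventory.
SAMPLE_SITES = 861
USERS_YEAREND_1976 = 312
PLANNED_YEAREND_1978 = 433
FEDERAL_COMPUTERS_1975 = 8649


def share(users, total):
    """Percentage of sites using a DBMS, rounded to one decimal place."""
    return round(100 * users / total, 1)


def extrapolate(percent, population):
    """Apply a sample percentage to a larger population, rounded to whole sites."""
    return round(population * percent / 100)


share_1976 = share(USERS_YEAREND_1976, SAMPLE_SITES)
share_1978 = share(PLANNED_YEAREND_1978, SAMPLE_SITES)

federal_1976 = extrapolate(share_1976, FEDERAL_COMPUTERS_1975)
federal_1978 = extrapolate(share_1978, FEDERAL_COMPUTERS_1975)

print(share_1976, share_1978)      # 36.2 50.3
print(federal_1976, federal_1978)  # 3131 4350
```

The results reproduce the percentages and site counts given in the text, so the published figures are internally consistent under that assumption.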


2.1.3  Current  Problems  In  Data  Base  Usage.  Several  Federal 
agency  managers  of  data  processing  shared  with  TG-24  the  is- 
sues that  concerned  them.     These  were  identified  as: 


1.  Need for a common functional requirements specification
checklist to aid in the procurement of database
systems.

2.  Need  for  guidance  on  when  to  use  what  database  sys- 
tem and  how  best  to  achieve  its  intended  objectives. 

3.  Fear  of  a  single,  universal  standard  that  prevents 
effective  use  of  databases  because  of  that  agency's 
particular  needs. 

4.  Need  to  identify  and  standardize  subcomponents. 

5.  Need  for  assistance  in  the  procurement  process. 

6.  Need  for  assistance  in  converting  to  the  standard. 

7.  Need  for  a  quick,  easy,  low  volume  access  to  data. 

8.  Need  for  a  standard  that  assists  in  reducing  change 
and  promotes  user  control  of  changes  to  product 
specifications  rather  than  vendor  control. 


2.2     IS  NOW  THE  TIME  FOR  DBMS  STANDARDS? 


2.2.1  Standards  Impact  On  Database  Technology.  The  Task 
Group considered specifically the possibility that standardization
at this time might have a harmful effect on database
technology.  The  result  of  this  consideration  was  to  note 
how  difficult  such  a  hypothesis  was  to  prove  or  disprove. 
Further,  it  was  not  clear  whether  such  a  question  was  a 
proper  consideration  of  the  Task  Group.  Stated  more  direct- 
ly, the  recommendations  of  the  Task  Group  require  justifica- 
tion of  cost  savings  or  cost  avoidance  for  the  Federal 
government as a whole. Such considerations of future DBMS
technology impacts could properly be considered under potential
or future costs but may be secondary to immediate costs
or  benefits. 

TG-24  reviewed  various  scenarios  in  which  "premature" 
database  standards  might  inhibit  new  technology.  The  Task 
Group  determined  that  difficulties  arose  immediately  upon 
trying  to  select  a  metric  for  considering  technology  growth. 
Certainly  subjective  conclusions  can  be  found  everywhere. 
However,  any  such  conclusions  must  consider  three  factors  -- 
the  type  of  standard  concerned,  the  manner  in  which  the 
selected  standard  is  promulgated,  and  the  extent  to  which  it 
is  accepted.  Different  combinations  of  these  factors  will 
affect different stages of the program development cycle with
differing  impact  on  technology. 

TG-24  looked  for  precedents  in  past  standards  activi- 
ties. It  reviewed  the  ASCII,  COBOL,  and  MUMPS  standards 
history  and  found  no  obvious  instance  where  standards  have 
inhibited  or  adversely  affected  technological  growth. 

2.2.2  Current  Inventory  of  Standards  Candidates.  The  size  of 
the  inventory  of  potential  candidates  for  standardization 
depends  on  the  degree  of  detail  required  for  the  statement 
or  specification  of  the  candidate.  For  example,  the  various 
CODASYL  specifications  are  quite  detailed  statements  of  syn- 
tax and  semantics  presented  in  a  very  formal  manner.  On  the 
other  hand,  some  have  argued  that  the  user's  manual  of,  say, 
a  proprietary  database  system  is  equally  a  specification. 
Proprietary  systems  raise  a  second  issue:  the  availability 
of the specifications for use by all interested implementors.

2.2.3  Sources of DBMS Standard Candidates. Identifiable
sources of standard candidates do exist and should be examined.
To aid its work, TG-24 reviewed the following
sources:

o  Existing computer vendor software

o  Existing proprietary packages

o  Existing specifications from volunteer developmental
groups

o  Existing Federally "owned" software

o  NBS developed specifications

In  considering  each  of  the  broad  areas,  TG-24  noted  a  set  of 
common  requirements: 

o  A clear and precise specification that permits any
vendor to implement it

o  The general availability of such a specification

o  A concrete method to validate the specification and
mediate disputes

o  A body that reviews and updates the specification

o  A method to validate implementations of the specification
and mediate different interpretations of it

o  The need to deal with the impact on those systems
not selected

2.2.4  Timeframe  For  Standards.  Significant  time  requirements 
enter  in  the  consideration  of  standards  and  especially  so 
for  DBMS  standards.  The  Task  Group  concluded  that  a  ten 
year  life-cycle  was  the  proper  timeframe  in  which  to  con- 
sider DBMS  standards.  One  to  two  years  may  be  required  for 
the  development  of  the  standard  specifications.  This  time 
is  not  included  in  the  ten  year  period.  Another  important 
time consideration is the time from the availability of the
specifications  to  the  availability  of  the  first  implementa- 
tion. This  period  may  also  be  one  to  two  years.  The  ten 
year  period  will   include  periodic  reviews. 

Development  of  totally  new  DBMS  specifications  would 
require  significantly  longer  periods  before  the  preparation 
of  specifications  and  the  availability  of  useful  products. 

2.3     WHAT  ARE  THE  ALTERNATIVES? 

In  deciding  which  approach  to  take  in  standardizing 
DBMS,  consideration  must  be  given  to  existing  products,  the 
feasibility  of  developing  totally  new  products,  and  the  work 
of  other  standards  bodies.  At  the  same  time,  the  feasibili- 
ty of  implementation,  the  timeframe  of  the  standard's 
development,  the  possible  longevity  of  the  standard,  and  the 
manner  in  which  this  rather  large  subject,  DBMS,  is  divided 
into  discrete  components  will  also  affect  the  decision. 

2.3.1  Alternatives  To  Any  DBMS  Standards.  Several  alterna- 
tives  to  standards  do  exist: 

1.  Effective  conversion  tools  would  reduce  the  need  for 
standards.  However,  "Data  Base  Directions  II--the 
Conversion  Problem"  [BERG  78]  reported  the  findings 
of  a  group  of  experts  which  stated  that  conversion 
technology has a need for standards in the area of
data interchange formats. Given such a standard, a
translator  that  quickly  and  cheaply  converted  any 
database  to  any  other  database  might  eliminate  the 
need  for  standards. 

2.  Data  Dictionary/Directory  standards  may  result  in 
lessening  the  need  for  DBMS  standards  by  providing 
common descriptions for all DBMS to use.

3.  The  Data  Description  Language  used  to  provide, 
bridge,  or  interface  with  several  Data  Manipulation 
Languages  may  permit  a  multiplicity  of  DML's  while 
providing  many  of  the  standard's  objectives. 

4.  For  smaller  agencies,  the  availability  of  a  single 
time-sharing DBMS may be used to satisfy the standardization
objectives for Federal applications requiring
the interchange of data and programs.

2.3.2  Standards  Other  Than  Federal.  The  Federal  government 
could adopt standards from other bodies or take advantage of
existing de facto standards. These include:

o  ANSI Standards

o  De Facto Vendor Standards -- in the absence of other
standards,  the  vendor's  practice  of  making  the  same 
product  available  to  all  of  its  customers  simplifies 
its  task  of  maintaining  and  correcting  the  software 
it  supports.  This  leads  to  a  general  compatibility 
between  users  of  the  same  vendor  systems.  General- 
ly, users  groups  have  developed  to  exploit  this  com- 
patibility through  the  interchange  of  information 
among  the  users.  However,  such  de  facto  vendor 
standards  are  under  the  control  of  the  vendor  and 
subject  to  changes  needed  to  meet  vendor  goals. 
While  vendors  are  sensitive  to  customer  needs,  ex- 
perience has  shown  that  such  changes  have  occurred. 

o  Proprietary system issues -- Owners of proprietary
systems  may  not  wish  to  give  up  ownership  or  be  un- 
able to  provide  precise  specifications  of  existing 
systems.  At  the  same  time,  the  competitive  nature 
of  the  computer  industry  may  result  in  a  gradual 
development  of  proprietary  systems  toward  the  stan- 
dard [BROC  76]. 

o  International Standards

o  Individual Government Agency Standards

o  Functional Application Area Standards -- for example,
law enforcement, health, library, air transportation,
etc.


2.4     FUTURE  OF  DBMS  TECHNOLOGY 

TG-24 concluded after a review of pertinent references
that database technology would continue to be characterized
by significant and important work. The Task Group agreed
with those who see change as inevitable over the next ten
years. However, the Task Group concluded that change would
be evolutionary and assumed that "breakthroughs" would
probably not occur in that time period. Further, TG-24 concluded
that proper standards planning will enhance the ability
of Federal agencies to cope with breakthroughs if they
should occur.


2.5     HOW  ARE  DBMS  STANDARDS  JUSTIFIED? 

The  first  step  in  any  DBMS  standardization  effort  is 
developing  a  proposal  for  a  particular  specification.  The 
next  step  is  to  justify  the  commitment  of  resources  to 
develop  the  specification.  TG-24  used  a  combination  of 
quantitative  and  qualitative  analysis  to  justify  its  recom- 
mendations. The  Task  Group  began  by  systematically  identi- 
fying the  costs  and  benefits  of  DBMS  standardization  and  do- 
cumenting the  underlying  assumptions.  This  work  assisted 
TG-24 in developing a set of priorities that led to the consideration
of an interrelated set of DBMS components which
it  identified  as  a  "family  of  standards." 

The  work  will  also  assist  those  developing  the  specifi- 
cations to  compare  actual  experience  against  the  assumptions 
in  order  to  guide  the  specification  development  effort. 

The difficulty in finding quantifiable costs led the
Task Group to a method depending primarily on qualitative
assessments. Therefore, the justification in the Recommendation
Section is presented in qualitative terms.


3.     RECOMMENDATIONS FOR STANDARDS


3.1     APPROACH AND OVERVIEW


3.1.1  DBMS  Components.  Dividing  DBMS  into  functional  modules 
which  together  would  meet  the  diverse  needs  of  the  Federal 
agencies  provided  the  basis  for  organizing  the  TG-24  stan- 
dards investigation  and  recommendations.  The  components 
that  TG-24  chose  as  requiring  standards  considerations  are: 

0  Data  Dictionary/Directory  Facilities 

0  Data  Description  Facilities 

0  Data  Manipulation  Facilities 

0  Query  and  End-User  Facilities 

3.1.2 Structure of Recommendations. Each of the components
considered is presented in a form indicated by the following
outline.

I.  Background (problem statement, scope and definition,
major approach and methodology)

II.  Recommendations  (stated  briefly  and  clearly) 

III.  Justifications 

(a)  Assumptions 

(b)  Benefits  and  Cost  Avoidance 

(c)  Costs (standard implementation, maintenance
and usage)

(d)  Discussions

All  technical  details  and  figures  will  appear  in  the  ap- 
pendices. 

3.1.3  General  Assumptions.  In  listing  the  assumptions  to 
justify  each  of  its  recommendations  for  each  component, 
TG-24  found  that  some  assumptions  were  common  to  all  or  to 
many  of  them.  Therefore,  a  list  of  general  assumptions  that 
are  applicable  to  the  Federal  DBMS  standardization  effort  as 
a whole follow:

0    Life cycle of database standard is ten years.

0    The  use  of  DBMS  will  continue  to  increase. 

0  The  databases  will  grow  in  size,  number,  and  com- 
plexity within  the  next  ten  years. 

0  Changes in hardware and software over the next ten
years  will  have  little  effect  on  the  basic  functions 
performed  by  data  base  systems. 

0  Present  DBMS  architectures  will  be  useful  even  in 
emerging  new  system  environments  (such  as  distribut- 
ed systems)  and  changes  to  the  DBMS  architectures 
will  be  evolutionary  over  the  next  ten  years. 

0  There  will  be  more  DBMS  conversions  within  each  of 
the  Federal  organizations. 

0  Transfer  of  data  among  Federal  agencies  within  leg- 
islated guidelines  will  be  increasing.  Cost  and 
benefits  considered  will  be  limited  to  those  of 
Federal  agencies. 

3.1.4 General Benefits. The general benefits that can be
identified for DBMS standardization as a whole are as follows:

1.  Easier  data  conversion. 

2.  Easier  program  conversion. 

3.  Improved  personnel  transferability. 

4.  Improved data sharing among different computer installations.

5.  Improved  DBMS  competition  among  vendors. 

6.  Improved selection and evaluation of DBMS products.

7.  Improved DBMS procurement process through larger
vendor choice and competition.

3.1.5 General Cost Considerations. The general cost considerations
parallel the set of common standards requirements
discussed in the section dealing with the purpose and
approach of TG-24. These are the costs of:

1.  Developing a standard specification. Such a specification
can be obtained from volunteer organizations,
a government agency, vendors releasing their
proprietary rights, or from a body established to
provide the specifications.

2.  Publishing  the  specification  and  analyzing  the  com- 
ments received. 

3.  Validating  the  specifications  and  mediating  the 
differing  interpretations. 

4.  Validating  the  implementations  for  conformance  to 
the  specifications. 

5.  Maintaining specifications over the standard's life
span.

While  some  of  these  costs  are  one-time  costs,  several  are 
on-going  costs  over  the  life  cycle  of  the  DBMS  standard.  In 
addition  to  the  costs  associated  with  developing  the  stan- 
dard specification,  some  costs  can  be  identified  with  in- 
stalling the  standard  DBMS  component. 

Note  that  costs  associated  with  converting  to  a  stan- 
dard can  be  avoided  by  requiring  use  of  the  standard  only 
when  another  form  of  change  forces  a  conversion.  Then  the 
conversion  to  the  standard  has  no  direct  costs.  For  exam- 
ple, first  time  users  selecting  a  data  dictionary  system 
would probably pay no more for a data dictionary that produces
output compatible with standard database systems than
they would for a non-standard dictionary system.

Similarly,  costs  resulting  from  the  conversion  to  im- 
proved hardware  or  software  capabilities  may  be  timed  to  in- 
clude changes  to  standard  practices.  Agencies  may  shift  to 
standard  practices  in  a  piecemeal  fashion  as  existing  non- 
standard practices  require  significant  changes. 

Costs  associated  with  installing  a  standard  DBMS  com- 
ponent when  the  above  considerations  are  ignored  include: 

0     Personnel training and temporary loss of productivity

0     Conversion  of  data 

0    Translation  costs  of  programs 

Cost will vary with the degree of difference between the
non-standard system and the standard.


3.2     TERMINOLOGY


3.2.1 Background. The Federal government currently uses many
different  data  base  management  systems.  Each  of  these  sys- 
tems is  unique  in  that  its  developer  has  adopted  terms  to 
represent  syntactic  and  semantic  entities  which  fit  his  own 
environment.  For  this  reason,  many  similar  data  management 
functions  are  described  in  terms  which  are  quite  dissimilar. 
In  the  same  fashion,  many  terms  which  are  shared  between 
several   DBMS  may  have  quite  different  meanings. 

TG-24  recognized  that  its  members  used  DBMS  terms  in  a 
dissimilar  manner.  Consequently,  the  terminology  appearing 
in  Appendix  3  of  this  report  was  developed  to  help  TG-24 
members  in  their  written  and  oral  communications.  This  ter- 
minology definition  is  not  intended  to  be  a  standard. 

Work  to  date  by  ANSI  has  established  a  standard  glos- 
sary of  data  processing  terms  [ANSI  77].  The  absence  of 
data  base  terms,  however,  is  quite  pronounced.  The  Interna- 
tional Standards  Organization/Technical  Committee  97  is 
currently  involved  in  establishing  such  a  glossary. 

3.2.2  Recommendations. 

1.  A  standard  set  of  data  base  oriented  terms  should  be 
established,  should  be  coordinated  with  the  work  of 
ISO,  and  should  be  included  in  the  current  ANSI 
standard  data  processing  glossary. 

2.  Guidelines  should  be  established  which  encourage 
DBMS  developers  to  use  this  glossary  when  describing 
database  concepts. 

3.  Guidelines  should  be  established  whereby  vendors  are 
encouraged  to  utilize  DBMS  language  syntax  which  is 
compatible with this glossary.

3.2.3 Justification.
++Assumptions.

1.  The  proliferation  of  DBMS  will  foster  invention  of 
unique DBMS terms for similar concepts.

2.  Terminology  standards  will  continue  to  be  developed 
by  ANSI  and  ISO. 

++Benefits and Cost Avoidance. Benefits to be derived from
establishing a common set of database oriented terms can be
seen mainly in the areas of training and information
interchange. A terminology standard would permit preliminary
training (e.g., university, technical school) to address
general knowledge of database concepts, these concepts
being embodied in the database glossary. Training at more
specific levels could therefore be shorter in duration and
more productive.

In  the  same  fashion,  interchange  of  information  between 
individuals  using  different  database  systems  can  be  facili- 
tated. Different  features  can  be  related  to  the  glossary  of 
terms, thereby providing a mapping of information about one
system  onto  another. 
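
The mapping described above can be illustrated with a short sketch. It assumes a glossary that assigns each system-specific term to one neutral glossary term. The system names, the terms, and the function name are hypothetical and are not taken from Appendix 3 or from any glossary standard.

```python
# Hypothetical glossary: for each system, a system-specific term is related
# to one neutral glossary term. None of these entries come from the report.
GLOSSARY = {
    "system_a": {"segment": "record", "field": "data element"},
    "system_b": {"tuple": "record", "attribute": "data element"},
}


def translate(term, source, target, glossary=GLOSSARY):
    """Map a term used by `source` to the equivalent term used by `target`.

    Returns None when either system has no entry for the shared concept.
    """
    concept = glossary.get(source, {}).get(term)
    if concept is None:
        return None
    for candidate, candidate_concept in glossary.get(target, {}).items():
        if candidate_concept == concept:
            return candidate
    return None
```

Because every term is related only to the glossary, adding a further system requires one new set of entries rather than a separate mapping to each existing system.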

++Costs. The cost involved in establishing and maintaining
a  terminology  standard  is  relatively  small  since  Federal 
agencies  would  rely  on  voluntary  development.  The  pay-back 
for  this  effort  would  begin  immediately  and  would  easily 
justify  the  expense  in  establishing  the  standard. 


3.3    DATA  DESCRIPTION  LANGUAGE 


3.3.1 Background. The data description language (DDL) is defined
as a stand-alone language that describes:

0    Attributes  of  data  elements 

0    Logical relationships among units of data (records,
sets, etc.)

0    Logical   structure  of  the  database 

0    Logical  methods  of  access  to  data 

The  DDL  does  not  include  the  definition  of  physical 
storage  media. 

The  Federal  Government  currently  uses  a  large  number  of 
database  management  systems.  Often  a  single  agency  supports 
more  than  one  DBMS  in  order  to  satisfy  varying  user  needs. 
Each  DBMS  has  its  own  language  for  describing  the  attributes 
and  relationships  of  the  stored  data.  This  variety  of  data 
descriptions  makes  it  impossible  for  agencies,  or  even  dif- 
ferent units  within  an  agency,  to  easily  interchange  data. 
The  people  who  are  interested  in  using  the  data  are  forced 
to  learn  the  specific  DDL  which  describes  it.  When  data  is 
exchanged,  a  new  description  of  data  for  the  target  system 
must be written and tested before the data can safely be
used.

The current situation is both costly and time consuming.
Therefore, the recommendations are aimed at reducing
both the time and cost required to exchange data between
different users.


3.3.2  Recommendations. 


1.  A  two-part  data  description  standard  is  required 
that  contains  the  common  description  of  the  data 
element  (attributes)  and  the  facility  to  describe 
multiple  data  structure  classes. 

2.  The  specification  of  the  standard  description  of 
data  attributes  should  be  similar  to  the  attributes 
in  the  PICTURE  and  TYPE  clauses  of  the  CODASYL  DDL. 

3.  The  specification  of  the  standard  description  of 
data  structures  should  be  required  to  encompass 
current  data  models  such  as  the  hierarchical,  net- 
work, and  relational. 

4.  Consistent  with  recommendation  1,  the  standard 
description  of  the  network  data  structure  within  the 
DDL  should  be  based  on  the  CODASYL  DDL. 

5.  Consistent  with  recommendation  1,  companion  data 
structure  descriptions  for  the  hierarchical  and  re- 
lational  data  models  should  be  developed. 


++Explanation. The separation of the description of data
element  attributes  from  the  description  of  data  structure 
has  been  recommended  because  a  single  standard  for  both  will 
not  serve  the  needs  of  the  Federal  community.  Separation  of 
data  element  descriptions  from  data  structure  descriptions 
allows  maximum  flexibility  while  benefiting  from  the  advan- 
tages of  standardization.  It  would  provide  a  single  stan- 
dard for  data  element  descriptions  and  multiple  standards 
for  data  structure  descriptions.  The  concept  of  separation 
has  been  advanced  by  such  noted  authorities  in  the  field  as 
E.F.  Codd,  M.E.  Senko,  and  James  Martin.  [CODD  71,  SENK  73, 
MART  77] 

The  attributes  required  to  describe  data  elements  by 
any  DBMS  are  similar,  although  the  syntax  and  semantics  used 
are  often  very  different.  A  single  standard  in  this  area 
would  simplify  the  user's  task  in  describing  the  data  ele- 
ments contained  in  the  data  base;  it  was  felt  that  the 
choice  of  syntax  used  to  describe  data  element  attributes 
was irrelevant. The attributes of the CODASYL DDL were suggested
because:

1.  They are well documented with a formal specification.

2.  They  are  supported  by  a  maintenance  group. 

3.  They  are  an  extension  of  COBOL,  a  widely    used  com- 
puter language. 

4.  There are currently several DDL implementations
based on the CODASYL proposal.

In the data structure area, both the characteristics
and the semantics/syntax required to describe them are different.
Three discrete data models have been identified,
i.e.,  hierarchical,  network,  and  relational.  Each  of  these 
models  has  unique  characteristics  that  require  a  unique 
structure  description.  A  single  standard  would  impose  one 
of  the  above  models  on  all  Federal  agencies  regardless  of 
user  requirements.  For  example,  a  user  with  an  ad-hoc  re- 
trieval requirement  who  may  best  be  served  by  a  relational 
data  model  would  be  needlessly  constrained  if  a  network  were 
adopted  as  the  single  standard. 
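
The two-part arrangement recommended above can be sketched as follows. This is an illustration only, assuming that part one is a single shared list of data element attributes and that part two is one structure description per data model, each of which refers to elements by name. The class names, field names, and sample elements are hypothetical and do not reproduce CODASYL DDL syntax.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ElementAttributes:
    """Part 1: one common description of a data element (hypothetical fields)."""
    name: str
    data_type: str  # for example "character" or "numeric"
    length: int


@dataclass
class StructureDescription:
    """Part 2: one structure description per data model class."""
    model: str  # "hierarchical", "network", or "relational"
    groupings: dict = field(default_factory=dict)  # grouping name -> element names

    def undefined_elements(self, elements):
        """Return element names this structure uses that part 1 does not declare."""
        declared = {e.name for e in elements}
        used = {n for names in self.groupings.values() for n in names}
        return sorted(used - declared)


# One shared set of element descriptions...
elements = [
    ElementAttributes("EMP-NO", "numeric", 6),
    ElementAttributes("EMP-NAME", "character", 30),
    ElementAttributes("DEPT-NO", "numeric", 4),
]

# ...reused by two different structure descriptions.
network = StructureDescription(
    "network", {"EMPLOYEE": ["EMP-NO", "EMP-NAME"], "DEPARTMENT": ["DEPT-NO"]}
)
relational = StructureDescription(
    "relational", {"EMP": ["EMP-NO", "EMP-NAME", "DEPT-NO"]}
)
```

In this sketch the element descriptions are written once and are unaffected by the choice of data model, which is the flexibility the separation is intended to provide.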

3.3.3  Justification. 

++Assumptions.

0  The  amount  of  data  description  in  the  Federal 
Government  will  continue  to  grow. 

0  The  amount  and  complexity  of  data  stored  in  database 
structures  by  the  Federal  Government  will  grow  sig- 
nificantly. 

0  The  increase  in  DBMS  usage  will  cause  an  increase  in 
DDL  training  requirement. 

0  The  demand  for  interchanging  of  data  between  DBMS 
will  grow. 

++Benefits and Cost Avoidance. Savings will be mainly in the
area  of: 

0  Personnel  -  After  the  initial  investment  of  re- 
educating all  personnel  in  the  use  of  the  standard, 
DBMS  retraining  costs  would  be  minimized.  Addition- 
al costs  would  be  avoided  by  reducing  cost  of  new 
training,  and  reducing  the  cost  of  low  productivity 
coupled  with  high  error  rates  during  the  period  of 
developing expertise in the use of the new DBMS.

0  Transfer of Data Descriptions - Cost of transfer of
DDL between source and target DBMS would be minimized
because the DDL of the source DBMS could be
directly compiled on the target DBMS. Opportunity
will exist for more automated means of converting
between different DBMS.

0  Transfer  of  Data  -  Transfer  of  data  would  be  simpli- 
fied because  the  use  of  the  standard  DDL  would  allow 
a  two  step  conversion.  The  source  data  would  be 
converted  to  an  intermediate  file.  The  intermediate 
file  would  be  loaded  on  the  target  DBMS  using  the 
same DDL. Target users would have no difficulty in
utilizing the data.
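
The two-step conversion described in the last item can be sketched as follows. The sketch assumes that the shared data description is reduced to an ordered list of element names with a conversion function for each, and that the intermediate file is comma-separated text. Both choices, and all names used, are assumptions made for illustration.

```python
import csv
import io

# Hypothetical shared description: element name -> conversion function.
SHARED_DDL = {"EMP-NO": int, "EMP-NAME": str}


def unload(source_rows, ddl=SHARED_DDL):
    """Step 1: write source records to an intermediate file in DDL element order."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in source_rows:
        writer.writerow([row[name] for name in ddl])
    return buffer.getvalue()


def load(intermediate_text, ddl=SHARED_DDL):
    """Step 2: read the intermediate file into target records using the same DDL."""
    reader = csv.reader(io.StringIO(intermediate_text))
    names = list(ddl)
    return [
        {name: ddl[name](value) for name, value in zip(names, fields)}
        for fields in reader
    ]
```

Since both steps consult the same description, no separate description of the data needs to be written and tested for the target system.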

There  are  other  equally  important  considerations  that 
are  not  easily  quantifiable.  These  are  timeliness  and  qual- 
ity of  data.  The  disruption  felt  by  end  users  during 
conversion  is  difficult  to  measure  but  is  a  real  factor. 

The  standardization  of  DDL  would  encourage  transfer  of 
data  between  users.  Currently,  the  time  and  cost  involved 
in  data  transfer  forces  users  to  do  without  needed  data,  or 
to  duplicate  collection  and  maintenance  of  data  existing 
elsewhere.  The  report  of  the  Federal  Paper  Commission  cited 
duplicate  collection  of  data  as  a  serious  government-wide 
problem.

Standardization  of  DDL  would  also  encourage  wide  use  of 
DBMS in the Federal Government. Many agencies are hesitant
to get "locked in" to a non-standard DBMS, causing them to
use far less efficient means of storing and accessing data.

Standardization  of  DDL,  as  recommended,  would  contri- 
bute both  quantifiable  and  non-quantifiable  cost  savings  in 
the  Federal  Government.  In  addition,  efficient  information 
processing  methods  would  be  fostered. 

++Costs. The Federal government should encourage and participate
in voluntary standard action. However, in order to
effect  the  intended  goals  the  Federal  government  should  be 
prepared  to  accept  the  costs  of  developing  and  maintaining 
the  Data  Description  Language  specification.  The  costs  will 
be  initially  the  development  of  the  specification,  whether 
in  voluntary  standards  groups  or  as  a  Federal  effort,  and 
the subsequent maintenance of the specification.


3.4    DATA MANIPULATION LANGUAGE


3.4.1 Background. Data Manipulation Language (DML) is the
language which a programmer uses to cause data to be
transferred  between  a  program  and  the  database.  The  DML  may 
not  be  a  complete  language  by  itself.  It  may  rely  on  a  host 
programming  language  to  provide  the  procedural  capabilities 
required  to  manipulate  data.  A  user  application  program  is 
written  using  a  mixture  of  host  programming  language  state- 
ments and  the  DML  commands.  The  DML  provides  the  ability  to 
interact  with  the  database  by  giving  commands  to  cause  an 
action  and  by  providing  a  means  to  receive  responses  from 
the  database  or  the  processors  involved.  The  DML  consists 
of  several  data  manipulation  commands  or  functions  which  may 
include  the  following:  data  retrieval,  data  addition,  data 
modification,  data  deletion  and  modifications  of  data  rela- 
tionships. 
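
The five classes of data manipulation function listed above can be illustrated with a minimal sketch in which Python stands in for the host programming language. The class and its method names are hypothetical and do not correspond to the DML of any DBMS reviewed by TG-24. Records are held in memory only.

```python
class InMemoryDML:
    """Hypothetical DML offering the five function classes named in the text."""

    def __init__(self):
        self.records = {}           # key -> record (dict of element values)
        self.relationships = set()  # (owner key, member key) pairs

    def add(self, key, record):
        """Data addition."""
        self.records[key] = dict(record)

    def retrieve(self, key):
        """Data retrieval; None is the response when nothing is found."""
        record = self.records.get(key)
        return dict(record) if record is not None else None

    def modify(self, key, **changes):
        """Data modification."""
        self.records[key].update(changes)

    def delete(self, key):
        """Data deletion; relationships naming the record are dropped too."""
        del self.records[key]
        self.relationships = {p for p in self.relationships if key not in p}

    def connect(self, owner, member):
        """Modification of data relationships."""
        self.relationships.add((owner, member))

    def members_of(self, owner):
        return sorted(m for o, m in self.relationships if o == owner)
```

An application program would mix ordinary host language statements (loops, tests, arithmetic) with calls such as these, which is the division of labor between host language and DML described above.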

The  diversity  of  data  manipulation  functions  and 
languages  causes  many  problem  areas  which  would  benefit  from 
standardization.  The  following  problems  occur  with  current 
DBMS  technology  when  users  must  contend  with  two  or  more 
DBMS. 

0  There  exists  a  different  (non-standard)  DML  for  each 
DBMS. 

0  The  data  manipulation  functions  of  each  DML  differ 
in  syntax  and  semantics. 

0  The  data  manipulation  functions  also  differ  in  syn- 
tax and  semantics  between  the  various  host  program- 
ming languages  for  a  given  DBMS.  Specifically,  the 
DML  for  the  FORTRAN  language  interface  is  different 
from  the  COBOL  language  interface. 

0  The  functional  capabilities  provided  by  DML  are  dif- 
ferent across  the  set  of  available  DBMS.  This  causes 
problems  in  conversion  and  procurement. 

TG-24  evaluated  the  DML  of  twelve  of  the  more  commonly 
used  DBMS  and  discovered  four  groupings.  Chart  I  in  Appendix 
5  shows  the  twelve  DBMS  and  their  DML.  Charts  A,  B,  C,  and  D 
show  the  four  categories  of  DBMS  identified,  and  examples  of 
the  DML  for  each  category. 

TG-24 developed the concept "category" to help it
analyze potential standards. The term "category" is an ad
hoc concept used merely for the purposes of our analysis.
Categories are groupings of DBMS based on some technical
criteria. The criteria used by TG-24 to categorize DBMS were
the commonality of data manipulation functions and data
structures. Categories should not be equated to data models,
although data structures are used as a major criterion to
categorize data base systems. The term "data model" refers
to the type of data structuring permitted by a DBMS (e.g.,
network, hierarchical, relational, ...) and various data
models appearing in the literature may have only slight
differences. See for example [MART 77] and [KERS 76] for a
discussion of different data models.

Though  the  concept  of  "categories"  has  special  utility 
for  TG-24's  analysis,  the  idea  behind  this  method  may  be  of 
use  to  those  who  will  be  performing  the  follow-on  detailed 
work.  A  more  comprehensive  analysis  to  categorize  DBMS  may 
require  additional  criteria.  TG-24  does  not  wish  to  restrict 
the  methodology  used  to  categorize  DBMS,  but  the  intent  of 
the  analysis  should  be  to  support  the  goal  of  standardiza- 
tion by  eliminating  smaller  differences,  combining  similar 
functionality,  and  producing  the  smallest  number  of  ca- 
tegories. 

The  following  recommendations  assume  four  categories  of 
DBMS  with  at  most  one  standard  DML  for  each  host  programming 
language  for  each  category  of  DBMS  and  a  potential  of  one 
standard DML for all DBMS.


3.4.2  Recommendations. 


1.  Develop multiple DML standards specifications.

2.  (a)  For  each  standard  host  programming  language 
(e.g.,  COBOL,  FORTRAN),  develop  immediately  a  stan- 
dard DML  specification  for  each  category  identified 
by  TG-24. 

(b)  As  a  short  range  goal,  develop  a  single  standard 
DML  specification  for  a  given  category  that  inter- 
faces with  all   standard  host  programming  languages. 

(c)  As  a  long  range  goal,  develop  a  single  standard 
DML  specification  containing  the  functionality  of 
all  categories. 

3.4.3  Justification. 

++Assumptions.

2.  Increasing numbers of application programs will be
written using the DML.

3.  The  present  interrelationships  between  data  defini- 
tion languages  and  data  manipulation  languages  will 
remain  the  same  in  the  next  ten  years. 

4.  The  existing  differences  present  in  the  host  pro- 
gramming language  interfaces  are  not  necessary  to 
provide  the  needed  data  manipulation  functionality. 

5.  The  increasing  number  of  computer  conversions  anti- 
cipated within  the  Federal  government  will  require 
reprogramming a large number of application programs
using  DML. 

6.  Data  manipulation  functions  and  data  structures  are 
valid  criteria  for  categorizing  DBMS. 

++Benefits  and  Cost  Avoidance. 

0  A  greater  return  from  (or  a  reduction  of)  required 
resources  for  programmer  training,  program  conver- 
sion, data  transfer,  and  application  system  transfer 
will  occur  with  standard  DML. 

0  Limiting  the  number  of  standard  syntactic  and  seman- 
tic definitions  for  data  manipulation  functions  will 
simplify  the  terminology  differences  and  make  it 
possible  to  evaluate  the  capabilities  of  DBMS. 

0  The  evaluation,  selection,  and  procurement  of  DBMS 
will  be  simplified  by  the  standardization  of  a  lim- 
ited number  of  data  manipulation  languages. 

0  The  proposed  standard  DML  interface  will  simplify 
data  sharing  in  computer  networks,  community  data 
bases,  and  distributed  databases. 

++Costs. The following factors affect the standard's cost:

0  Increasing  the  number  of  categories  and  the  number 
of  host  programming  language  interfaces  will  signi- 
ficantly increase  the  cost  of  specifying  and 
developing  the  family  of  standard  DML. 

0  Conversion  cost.  A  cost  will  be  associated  with  con- 
verting existing  DML  to  standard  DML.  The  cost  will 
vary  with  the  differences  between  existing  DML  and 
the  appropriate  standard. 

However, conversion costs can be avoided by:

0     installing the standard only for new applications.

0     delaying adoption of the standard until conversion
is required by other factors.

++Discussion. The first of the recommendations posed above
allows for the development of multiple DML standards. TG-24
seeks  to  permit  flexibility  in  DBMS  standards  by  providing  a 
family  of  standards.  Such  an  approach  will  allow  an  overall 
database  architecture  to  exist  that  will  provide  useful 
functions  and  data  structures  that  are  not  currently  provid- 
ed by  any  single  DBMS  using  a  single  DML  standard. 

The second recommendation is a gradual movement towards
a desirable end goal. It would be very beneficial to the
DBMS end-user community if one standard DML were possible and
practical. TG-24 felt that the standardization effort could
not begin with the single DML standard route, but the effort
may evolve to that end. There are major differences between
the data structures and DML provided by various DBMS today.
Nevertheless, if the DBMS continue to evolve, it is very
possible that the major differences between DBMS will disappear
and all DBMS will provide similar major capabilities.
This would allow a single DML to be practical.

Figure 1 - Relationship of Host Language Standards to DML standards

The  matrix  in  figure  1  illustrates  the  DML  recommenda- 
tions. Each  of  the  categories  listed  down  the  left  hand 
edge  of  the  matrix  provides  the  user  with  a  useful  viewpoint 
of  the  database.  Across  the  top  of  the  matrix  appear  exam- 
ples of  existing  standard  languages.  As  indicated  by  the 
matrix,  each  intersection  of  row  and  column  provides  an  op- 
portunity for  a  specific  DML.  For  example,  the  CODASYL  da- 
tabase specifications  would  be  a  candidate  for  the  intersec- 
tion of  COBOL  and  the  network  category  but,  of  course,  would 
not satisfy the COBOL/relational intersection. Similarly,
CODASYL  has  proposed  a  FORTRAN  DML  which  is  a  candidate  for 
filling  the  network  category  and  FORTRAN  intersection. 

The  DML  recommendation  identified  each  of  the  intersec- 
tions as  a  potential  standard.  However,  TG-24  also  noted  the 
inherent  commonalities  that  would  exist  in  all  DML  for  any 
particular  category.  Save  for  differences  of  syntax  re- 
flecting the  specific  host  language,  the  functions  performed 
essentially  remain  the  same  over  all  the  languages.  One  DML 
independent  of  any  particular  language  could  satisfy  all  the 
host languages for any particular category. The matrix indicates
this with the final right-hand column, which contains a
DML for each category.

Finally, TG-24 considered the fact that all the different
categories were treating the same important data.
The concept of a superset of data manipulation functions
common to all the categories but bundled to eliminate
differences of syntax, data structures, or a particular
viewpoint is not new. Several technical papers have treated
the subject of "reconciling" various data models into a common
language for the user. This led to the conceptual possibility
of "summing" the host language independent DML in
the right hand column into the one common DML found at the
bottom of the column.

The  process  of  abstracting  the  various  DML  into  a 
category-common  DML  and  then,  further,  into  a  single,  common 
DML  is  admittedly  speculative  and  requires  demonstrations  of 
feasibility.  However,  such  an  approach  would  support 
directly  the  goals  of  standardization  and  TG-24  recommended 
its investigation.
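
The matrix discussed in this section can be sketched as a small table of slots. The sketch shows only the two categories and two host languages that the discussion names explicitly; the recommendations assume four categories. The function names and the labels "language-independent" and "common" are hypothetical, and the two candidates entered are those mentioned in the text.

```python
def build_dml_matrix(categories, host_languages, candidates=None):
    """Return {row: {column: candidate or None}} for the DML matrix.

    Columns are the host languages plus a final "language-independent" column.
    A final "common" row stands for the single DML covering all categories.
    """
    candidates = candidates or {}
    columns = list(host_languages) + ["language-independent"]
    matrix = {
        category: {col: candidates.get((category, col)) for col in columns}
        for category in categories
    }
    matrix["common"] = {"language-independent": None}
    return matrix


def open_slots(matrix):
    """List (row, column) cells that have no candidate specification yet."""
    return [
        (row, col)
        for row, cols in matrix.items()
        for col, spec in cols.items()
        if spec is None
    ]


matrix = build_dml_matrix(
    categories=["network", "relational"],
    host_languages=["COBOL", "FORTRAN"],
    candidates={
        ("network", "COBOL"): "CODASYL database specifications",
        ("network", "FORTRAN"): "CODASYL FORTRAN DML proposal",
    },
)
```

Listing the open slots makes the three stages of the recommendation visible: the host language cells first, then the right-hand column, and finally the single common entry at the bottom.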


3.5     DATA  DICTIONARY/DIRECTORY  FACILITY 


3.5.1 Background. A data element dictionary/directory (to be
referred  to  in  short  as  data  dictionary  (DD))  is  a  software 
tool  that  is  used  to  identify  and  interrelate  data  elements 
within  an  application  or  enterprise.  It  is  viewed  as  the 
central  repository  of  all  descriptive  information  about  each 
data  element  contained  within  an  application  database. 

The  scope  of  the  standardization  recommendations  ex- 
cludes data  dictionaries  that  are  manual  tools  for  data 
resource  management.  In  Appendix  6,  the  various  types  of 
data  dictionaries,  the  various  features  that  a  typical  data 
dictionary  would  provide,  and  some  commercial  package  names 
are  mentioned  for  illustrative  purposes. 

3.5.2 Recommendations.

1.  Data  dictionaries  used  by  Federal  agencies  must  be 
able  to  produce  the  standard  DDL  attribute  descrip- 
tion as  recommended  in  Section  3.3.2  by  the  DDL  Sub- 
committee of  TG-24.  (TG-24  took  no  position  on  the 
standardization  of  data  dictionaries  but  addressed 
only  those  data  dictionaries  with  an  interface 
between the data dictionaries and the DBMS which
must  be  standardized.) 

2.  The design of data dictionary must be capable of
combining standard data attribute descriptions with
data structure descriptions to generate the DDL for
one or more DBMS.

3.  Establish guidelines on the data dictionary usage.


3.5.3  Justification. 
++Assumptions.

0  As  databases  continue  to  grow,  the  need  to  control 
data  elements  descriptions  will  increase. 

0  Data  dictionary  usage  in  the  Federal  agencies  will 
increase  at  a  greater  rate  than  the  usage  of  DBMS 
within  the  next  10  years. 

0  DD  will  be  a  critically  important  tool  used  by  data 
base  administrators  for  the  central  control  of  data- 
bases. 

0  Distributed database management will increase the
need for data dictionaries.

++Benefits  and  Cost  Avoidance. 

0  A  data  dictionary  which  produces  standard  DDL  will 
reduce  the  cost  of  converting  to  the  standard  DDL. 

0  The  standard  DDL  produced  by  the  DD  to  be  interfaced 
to  DBMS  will  reduce  the  need  for  manual  coding  which 
in  turn  will   reduce  errors. 

0  There  will  be  significant  cost  savings  for  tran- 
sporting data  for  data  interchange  and  for  conver- 
sion purposes  because  a  standard  DDL  would  be  gen- 
erated from  the  data  dictionary. 

0  Data  dictionaries  will  provide  bridges  to  multiple 
DBMS. 

++Cost. The cost of implementing this recommendation will
be no more than the cost of implementing a data dictionary
that produces a non-standard DDL.

++Discussion. Note that the recommendation proposed for
the  data  dictionary  is  to  standardize  the  interface  between 
the  DD  and  the  DBMS.  Compliance  with  this  standard  will  per- 
mit the  users  of  DBMS  to  use  any  data  dictionary  that  has 
implemented  the  standard  interface  at  no  further  cost. 


3.6     QUERY LANGUAGE/END USER FACILITIES


3.6.1  Background.

There currently exists a tremendous variety of end-user
facilities for DBMS, and a substantial variety of taxonomies
for these facilities [LOUG 77]. Since many of these facilities
are quickly learned, and many of them are designed for
ad hoc activity, standardization of syntax and semantics
appears unwarranted. Substantial costs may nevertheless be
incurred by a poor match between the facilities needed and
those provided by a procured DBMS. To avoid this, the match
should be verified prior to procurement.

3.6.2  Recommendations. 

1.  Standardization of syntax and semantics of end-user
facilities is not required. Such facilities are
easily learned and are problem and subject-matter
dependent, and there is a diversity of end-user
facility "styles".

2.  Guidelines  should  be  developed  to  aid  the  specifica- 
tion of  requirements  of  end-user  facilities  for 
Federal   procurement  purposes. 

3.6.3  Justification. 
++Assumptions.

1.  End-user facilities will be the subject of intensive
investigation and rapid technical development; e.g.,
the role of intelligent terminals is just beginning
to emerge.

2.  A  variety  of  end-user  facilities  will  be  needed  to 
accommodate  the  different  needs  and  user  popula- 
tions. 

3.  The  problem  of  matching  end-user  facilities  to  re- 
quirements will  continue  to  be  difficult. 

4.  Programming effort will continue to be expended to
improve  end-user  facilities  where  facilities  are 
inadequate  or  inconvenient  for  the  user  group. 

++Benefits and Cost Avoidance.


1.  The  major  benefit  of  the  procurement  guideline  would 
be  improved  utility  of  DBMS  resulting  from  procure- 
ment of  end-user  facilities  which  best  fit  the  needs 
and  abilities  of  all  user  groups  which  are  to  use 
them.

2.  The  major  cost  avoidance  resulting  from  the  guide- 
lines is  due  to  the  decreased  need  for  "customizing" 
the  end-user  facilities,  e.g.,  by  programming  new 
facilities  using  host  languages  and  the  data  manipu- 
lation language,  or  by  enhancing  existing  facili- 
ties. 

3.  Usage  of  the  guideline  will  aid  the  user  primarily 
during  the  procurement  process.  While  not  even  an 
approximate  estimate  of  the  number  of  likely  DBMS 
procurements can be made without a survey, a single
procurement error could easily waste several
months' time. Additional uses of the guideline would
be  to  guarantee  retention  of  all  existing  functions 
during  conversions,  to  provide  guidance  in  upgrading 
existing  facilities,  and  to  assist  database  adminis- 
trators in  providing  the  appropriate  tools  to  dif- 
ferent user  groups. 

++Costs.

0  Development  of  the  guideline  should  be  relatively 
inexpensive.  Maximum  use  should  be  made  of  existing 
studies  and  taxonomies  of  end-user  facilities,  and 
also  of  past  procurement  efforts  where  lists  of 
specific  requirements  were  used.  A  major  part  of 
the  development  effort  should  be  a  follow-up  on  the 
major  procurements  to  check  areas  where  experience 
after  procurement  indicates  facilities  were  lacking. 

++Discussion.

0  General  comments.  There  are  many  examples  of  multi- 
ple DBMS  in  use  by  single  organizations.  A  plausi- 
ble explanation  for  this  phenomenon  is  that  the  dif- 
ferent DBMS  provide  different  sets  of  end-user  fa- 
cilities which  are  so  attractive  to  different  user 
groups  that  the  costs  of  multiple  DBMS  are 
outweighed  by  user  convenience.  It  may  well  be  that 
multiple  end-user  facilities  are  in  fact  required. 
It  does  not  follow,  of  course,  that  multiple  data 
dictionaries,  data  definition  languages,  and  data 
manipulation  functions  are  required  -  ideally  dif- 
ferent end-user  facilities  would  make  use  of  common 
system parts. The guideline should therefore clearly
indicate that procurement officers may best
satisfy their requirements by purchasing several
end-user facilities, from one or several vendors,
plus possibly developing in-house facilities if
appropriate, all of which use the same interface to
other DBMS facilities.


Comments  on  first  and  second  assumptions:  A  neces- 
sary and  sufficient  set  of  end-user  facilities  may 
eventually  be  defined  as  a  result  of  current 
research.  At  that  point,  a  standard  for  end-user 
facilities should be developed. The first two assumptions
do not appear to be verifiable by survey. The consensus of
TG-24  is  that  the  assumptions  will  remain  valid  for 
several  years. 

Comments  on  third  and  fourth  assumptions:  These  as- 
sumptions have  been  informally  verified  by  surveying 
the  experience  of  TG-24  members.  A  formal  survey 
might  be  appropriate  to  demonstrate  more  general 
validity.


4.     RECOMMENDATIONS FOR GUIDELINES


4.1  BACKGROUND 

Several  areas  or  activities  associated  with  database 
management  systems  do  not  easily  lend  themselves  to  strict 
standardization.  These  activities  are  concerned  with  such 
functions  as  the  documentation  of  data  base  designs  and  im- 
plementations, ancillary  operations  performed  in  the  data- 
base environment,  and  the  establishment  of  database  adminis- 
tration functions.  This  indicates  the  need  for  guidelines  in 
addition to whatever standards may be necessary. These
guidelines will treat subject matter that is, perhaps,
premature for standardization, or that offers too diverse a
selection of choices to permit hard and fast direction imposed
externally, or that permits several equally good choices with no
particular Federal-wide benefit achieved by pointing to a
particular choice.


4.2  RECOMMENDATIONS 

In  addition  to  those  guidelines  recommended  in  the 
specific  subject  areas  above,  TG-24  concluded  that  guide- 
lines recommending  standards  of  good  practice  were  needed  in 
the  following  areas: 


1.     Computer  security  and  privacy  protection  in  database 
systems.


2.  Procedures  for  database  recovery,  reorganization  and 
audit.  At  this  time,  no  Federal  guidelines  exist.  A 
working  panel  report  on  database  auditing  [BERG  75] 
suggests  the  need  for  such  a  guideline.  A  group  of 
audit experts [RUTH 77] supports this point and
discusses  general  audit  administration,  methodology 
and  tools.  While  database  recovery  and  reorganiza- 
tion procedures  may  be  too  DBMS-specific  for  inclu- 
sion into  Federal  guidelines,  development  of  general 
guidelines  or  checklists  may  be  possible. 


3.  Documentation  -  Currently  FIPS  Pub  38,  "Guidelines 
for  Documentation  of  Computer  Programs  and  Automat- 
ed Data  Systems"  contains  guidelines  for  the 
description  of  database  specifications.  These  guide- 
lines describe  the  database  specification  for  a  da- 
tabase to  be  developed  during  the  design  stage  of 
the  development  phase  of  the  software  life  cycle. 
Guidelines  will  also  be  necessary  for  the  documenta- 
tion of  the  implemented  database,  i.e.,  after  it  is 
developed,  tested,  and  loaded.  This  would  be  analo- 
gous to  the  separate  guidelines  that  exist  for  the 
program  specification  and  the  program  maintenance 
manual.  The  "Database  Management  Manual"  might 
describe  the  database  in  its  final,  implemented 
form,  and  all  common  validation  routines,  recovery 
utilities,  reorganization  criteria  and  utilities, 
database  administration  support  software,  etc. 


4.  Performance  monitoring  -  An  important  start  has  been 
made  in  this  area  with  the  issue  of  FIPS  PUB  49, 
"Guideline  on  Computer  Performance  Management,"  in 
May 1977. In the context of database management,
guidelines could be developed for the monitoring of
database  growth  and  performance,  general  criteria 
for  reorganization,  database  (re)design  hints,  etc. 


5.  DBMS evaluation and selection - FIPS guidelines on
several activities in the analysis/evaluation/selection
cycle would greatly assist Federal DP managers
considering the database approach:

a.  Analysis of existing ("conventional") data
processing applications or potential applications (not
yet automated) for implementation under database
technology;

b.  Methods for the comparison of available DBMS
packages against application requirements and
selection of the best candidate(s);

c.  Documentation practices for these activities;

d.  Preparation of Requests for Procurement - a need
particularly noted by several managers who offered
their comments to the Task Group; and

e.  Methods for benchmarking DBMS as a means to aid
in evaluation of the responses to the Request for
Procurement.


6.  Database administration functions - Several existing
publications  from  various  sources  could  provide  the 
basis  for  a  FIPS  guideline  for  the  determination  of 
the  functional  duties  of  Federal  database  adminis- 
tration staffs. 


7.  Requirements analysis - Little guidance now exists
to assist Federal managers in preparing a good
statement  of  their  needs  prior  to  seeking  tools  to 
meet  these  needs.  Federal  managers,  particularly  in 
the  numerous  smaller  agencies,  need  such  help. 


4.3  JUSTIFICATION


4.3.1  Assumptions.


0    Useful   guidelines  can  be  compiled  from  existing  ma- 
terial on  good  information  practices. 


4.3.2  Benefits  and  Cost  Avoidance. 


0  Guidelines  can  provide  Federal  agencies  with  ap- 
proved practices  drawn  from  industry  experience  and 
tailored  to  the  special  needs  of  Federal  agencies. 
Examples  of  the  special  needs  of  Federal  agencies 
include:  procurement  practices  determined  by  regula- 
tions, statutory  constraints  of  data  processing  ac- 
tion, and  affirmative  action  programs  with  regard  to 
information policy.


0  Guidelines  can  be  maintained  and  modified  to  reflect 
Federal  experience  with  them  so  that  each  agency  can 
profit  from  the  collective  experience. 


4.3.3  Costs.  Costs  associated  with  guidelines  are  the  direct 
cost  of  compiling  and  reporting  recommended  standards  of 
practice  as  well  as  that  work  needed  to  develop  and  test 
practices  not  available  from  industry  sources.  In  addition, 
costs will be incurred in follow-up and verification of
published  guidelines. 


5.     MISCELLANEOUS RECOMMENDATIONS


5.1     REDUCING  STANDARD  ADOPTION  COSTS 


5.1.1  Background.  To force the adoption of the standards
recommended  in  this  report  universally  and  on  a  certain  date 
could  result  in  unnecessary  expense  to  Federal  agencies. 
Under  such  a  requirement,  conversion  costs  would  occur  in 
the  training  of  personnel,  the  conversion  of  existing  data- 
bases, and  the  translation  of  existing  application  programs. 
A  less  concrete  cost  would  be  the  loss  of  productivity  ex- 
perienced by  the  agency  during  this  period.  These  costs  can 
be  reduced  and  even  eliminated  by  judicious  timing  of  stan- 
dard installation. 


5.1.2  Recommendations. 


1.  The standards recommended should not be mandatory
until other factors require a change to the
relevant systems.


2.  Standards  can  be  required  for  new  applications  or 
systems  without  requiring  existing  systems  to  also 
conform  to  the  standard. 


5.1.3  Justification. 
++Assumptions.

1.  During  a  conversion  to  new  systems  requiring  changes 
to  programs  and  data,  the  use  of  standard  products 
adds  no  additional  costs  to  the  conversion  process. 

2.  New applications can be written in the standard
product at little or no additional cost. Costs associated
with maintaining two concurrent systems will be
low and offset by the reduction in conversion costs to
move to the standard system when other system
changes result in rewriting the existing programs
and converting the existing databases.


3.  Incremental installation of standards will be feasible.

++Benefits  and  Cost  Avoidance. 


These  recommendations  will   reduce  conversion  costs. 


5.2     DATABASE  STANDARDS   IN  OTHER  MAJOR  AREAS 


5.2.1  Background.  TG-24  considered  certain  other  areas  where 
database  systems  were  major  considerations.  These  were  data- 
base systems  provided  by  computer  processing  services 
(whether  in  batch  or  in  the  time  sharing  environment),  dis- 
tributed data  processing  or  distributed  data  bases,  and  the 
mini/micro  computer.  TG-24  treated  these  issues  only  to  the 
extent  that  it  could  conclude  that  the  basic  recommendations 
would  apply  to  these  systems  as  well. 


5.2.2  Recommendation.  TG-24  recommends  that  the  database 
standards  actions  described  above  apply  to  data  base  facili- 
ties provided  by  computer  services,  distributed  systems,  and 
minicomputers.


5.2.3  Justification. 

++Assumptions. The assumptions for this recommendation
flow from each of the detailed standard
recommendations. TG-24 assumes no significant differences
will be required for each of the subject areas.

++Benefits  and  Cost  Avoidance.  Applying  the  same  standards 
described  previously  to  the  database  systems  provided  by 
computer  processing  services,  to  distributed  systems,  or  to 
minicomputers  will  enhance  the  database  work  done  in  any  of 
the  other  areas.  It  will  insure  the  broadest  possible  shar- 
ing of  people,  programs,  and  data.  Expanding  the  area  of 
standards  application  will  provide  a  greater  payback  for  the 
costs expended in developing standards. Minicomputers,
particularly, will benefit from the central discipline exerted
by standards in view of their anticipated growth, diversity
of vendors, and the lack of a common operating system
discipline. Distributed systems, especially those combining
heterogeneous systems, will still have the conversion problem
between different systems, but it will be eased somewhat by the
common standards shared by all the different systems.

++Costs. The cost associated with standard systems will not be
different from that of non-standard systems. Conversion
costs  can  be  reduced  by  proper  timing  of  standards  usage. 


5.3     TRANSITIONAL  STANDARDS  ACTIONS 


5.3.1  Background.  TG-24  recognized  the  timeframe  of  the  many 
actions implied by its recommendations and recommends
actions for transitional periods. Development of standard
specifications  may  take  as  long  as  two  years  with  another 
year  after  that  for  the  implementation  of  a  standard  pro- 
duct. Standard  products  will  remain  merely  an  assertion  of 
the  vendor  until  validation  procedures  are  available.  What 
actions  should  Federal  agencies  take  until  a  fully  validated 
standard  product  is  available? 

The question suggests the existence of three time
periods:


1.     Until   a  specification  is  published. 


2.     From publication of the specification to its
commercial availability.


3.     From  commercial   availability  to  validation. 


These  periods  of  time  will  hold  for  each  specification 
resulting from these recommendations. For example, the
network DML and relational DML will have two distinct
development paths which need not be concurrent. In
addition, vendors may choose to combine periods 2 and 3 by
withholding a product until it has been validated.


5.3.2  Recommendations.  TG-24  recommends  the  following  posi- 
tions for  Federal  agencies: 


1.  Prior  to  the  development  of  standard  specifications 
for  a  Data  Description  Language,  Federal  agencies 
can prepare for standardization within their own
organization by using systems having a Data
Description Language similar to the CODASYL
language.


2.  In  time  period  1,  select  DBMS  systems  having  in- 
dependent Data  Description  Languages  which  describe 
data  attributes  similar  to  the  CODASYL  specifica- 
tions. 


3.  In  time  period  2,  require  database  systems  under 
procurement  to  meet  the  standard  specification  of 
the  Data  Description  Language  and  those  standard 
Data  Manipulation  Language  specifications  which  are 
appropriate  for  its  application.  For  existing  data- 
base systems,  solicit  translators  from  the  existing 
system  to  the  standard  systems.  Document  existing 
systems  using  the  standard  specifications.  Require 
Data  Dictionary  Systems  and  End  User  Facilities  to 
meet  the  standard  interface  requirements. 


4.     In  time  period  3,  require  all  new    applications  and 
procurements  to  use  the  standard  systems. 
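
The mapping from the three time periods to the recommended positions can be sketched as a small lookup. This is only an illustration: the function name, the three status flags, and the abbreviated action texts are assumptions made for the sketch, and the behavior after validation (returning None) is not addressed by the recommendations themselves.

```python
def transition_period(published, available, validated):
    """Return the time period (1, 2 or 3) for one specification.

    published: a standard specification has been published
    available: a product implementing it is commercially available
    validated: that product has passed validation
    Returns None once a validated product exists (assumed end state).
    """
    if validated:
        return None
    if not published:
        return 1
    if not available:
        return 2
    return 3


# Abbreviated paraphrases of the positions recommended in 5.3.2.
ACTIONS = {
    1: "select DBMS with an independent DDL similar to the CODASYL "
       "specifications",
    2: "require procurements to meet the standard DDL and the "
       "applicable standard DML specifications",
    3: "require all new applications and procurements to use the "
       "standard systems",
}

period = transition_period(published=True, available=False, validated=False)
print(period, "-", ACTIONS[period])
```

Because each specification follows its own development path, the function would be applied separately to each one (for example, once for the network DML and once for the relational DML).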


5.4     PRIORITIES  WITHIN  THE  RECOMMENDATIONS 


5.4.1  Background.  Technical interrelations of the
recommended family of database standards and payback considerations
argue  for  establishing  priorities  in  the  development  of  the 
recommended  standards.  These  priorities  are  reflected  by 
the  phasing  recommended  below. 


5.4.2  Phasing  Recommendations. 

++Phase 1. Develop specification for the common attribute
part of the two-part data description language standard as
described  in  Section  3.3.2. 

++Phase 2. Develop specification of the structural
description for data structure classes as described in Section
3.3.3.




++Phase  3. 

0    Develop  specification     for    each    data  manipulation 
language. 


0    Develop  guidelines  for  using    the    data  description 
language  specification. 


0     Specify  interface  requirements  for  data  dictionaries 
and  end-user  facilities. 


5.4.3  Justification. 


++Discussion. In preparing these phasing recommendations,
TG-24 notes that the network category has progressed further
through  these  phases  than  the  others.  However,  the  recom- 
mendations are  stated  to  avoid  implying  a  preference  for,  or 
the  inherent  superiority  of,    the  network  category. 


7.  APPENDICES


7.1     APPENDIX  1  -  FIPS  TASK  GROUP  24  CHARTER 


DATABASE MANAGEMENT SYSTEMS (DBMS)


++Purpose. To make recommendations within one year to NBS
and  the  FIPS  Coordinating  and  Advisory  Committee  on  the  need 
for  Federal  database  management  system  standards  and  any  ap- 
propriate standards  activities  to  meet  such  needs. 

++Scope. The task group will consider all areas within
current  data  base  technology  with  a  view  towards  defining 
relationships  between  DBMS  and  existing  FIPS  activities  and 
a  more  precise  statement  of  the  scope  for  recommended  stan- 
dards activities.  The  area  of  study  will  include:  distri- 
buted processing  and  databases,  networking,  data  description 
languages, data manipulation languages, data
dictionary/directory functions, DBMS support functions, and
the  role  of  mini-micro  computers  in  DBMS. 

++Program of Work.

1.  Determine  the  Federal  need  for  DBMS  standards.  Con- 
sider Federal  DBMS  standards  in  such  technical  areas 
as  DDL-DML,  minicomputers  and  networking. 


2.  Survey  existing  DBMS  models,  determine  the  relative 
merits  to  the  Federal  agencies  of  the  various 
models,  and  collect  formal  specifications. 

3.  Prioritize the various model specifications for
detailed study.


4.  For each model specification selected for detailed
study, survey existing implementations and review to
what extent each model specification feature was, in
fact, implemented.

5.  For each model specification selected for detailed
study, make a formal recommendation for Federal
action.


7.2    APPENDIX 2 - DBMS USAGE SURVEY FOR TG 24


BACKGROUND 

An  informal  survey  of  DBMS  usage  was  initiated  at  the 
first  TG  24  meeting  on  March  15,  1977.  All  participants  of 
TG  24  were  asked  to  survey  their  own  agency  in  the  use  of 
DBMS.  The  purpose  of  this  informal  survey  was  to  gather 
preliminary  data  to  substantiate  the  need  for  DBMS  stan- 
dards.    In  particular,  the  survey  attempted  to  find  out: 

0     Whether  Federal  agencies  are  heavily  committed 
to  the  use  of  DBMS. 

0     If DBMS are used, how they are being used.

0    Whether  TG  24  should  conduct  a  formal  survey 
on  DBMS  usage  in  order  to  assess  the  effect  of 
a  DBMS  standard  with  respect  to  current  operations. 

The  requested  information  for  the  informal   survey  follows: 

0  Names  of  DBMS    used  within  the  agency.     For  each 
DBMS  the  following  should  be  addressed: 

0  Type  of  application 

0  Type  of  usage 

0  Kind  of  user 

0  Type  of  support  of  central  facility 

0  Data  interchange 

It  was  also  agreed  that  a  software  system  is  a  Data  Base 
Management System if it possesses all of the following
properties:

0     It  is  an  integrated  set  of  computer  procedures 

0     It  facilitates  usage  of  data 

0     It facilitates storage and maintenance of
large amounts of data

0     It provides frequently used shared functions

0     It potentially serves many functional purposes.

SURVEY RESULTS

Survey Population


Fourteen agencies responded to the DBMS usage survey.
These  are  as  follows: 

1.  Agriculture 

2.  Air Force

3.  Army 

4.  Central   Intelligence  Agency 

5.  Defense  Intelligence  Agency 

6.  Federal   Energy  Administration 

7.  Federal   Home  Loan  Bank  Board 

8.  General   Services  Administration 

9.  Health,  Education,  and  Welfare 

10.  National   Bureau  of  Standards 

11.  National   Security  Agency 

12.  Office  of  Management  and  Budget 

13.  Pension Benefit Guaranty Corporation

14.  Veterans  Administration 

Detailed  data  collected  from  each  Agency     is  tabulated 
in  Table  1.     Summary  data  is  presented  below: 

Numbers  of  DBMS  In  Use  Within  Agencies 

1  agency  (FHLBB)   has  no  DBMS  at  present. 
1  agency  (NSA)  has  14  DBMS  in  use. 


The  remaining  agencies  have  1  or  more  DBMS.  The  conclusion 
from  this  fact  is  that  most  of  the  agencies  are  using  DBMS 
and  some  agencies  use  more  than  one  DBMS. 

In-House Written Versus Commercial Packages

Among  57  distinct  DBMS  in  use  in  the  survey  population, 
17  were  in-house  written.  6  were  developed  by  the  Air 
Force,  5  were  developed  by  NSA.  Roughly  75%  of  DBMS  in  use 
were commercial packages.
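
As a quick check of the proportion, the stated counts can be worked through directly. The variable names below are illustrative. On these counts the commercial share comes to 40 of 57, or about 70%, so the rounded figure quoted above should be read as an approximation.

```python
total_dbms = 57   # distinct DBMS reported in the survey population
in_house = 17     # of which were written in-house

commercial = total_dbms - in_house
commercial_share = commercial / total_dbms

# Prints: 40 commercial packages, 70% of the total
print(f"{commercial} commercial packages, {commercial_share:.0%} of the total")
```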

Characteristics  of  In-House  Written  DBMS 

Two  in-house  written  systems  are  used  for  document  re- 
trieval. The  remaining  in-house  written  DBMS  are  built  for 
intelligence and military applications. Real-time processing
with some on-line and batch usage is typical of the in-house
written systems.

Commercial DBMS Packages Distribution

(a)  Host-Language  Type  systems  -  These  include  Codasyl- 
like  systems  but  not  necessarily  Codasyl  syntax.  Systems 
include  DMS-1100,  TOTAL,  DMS-II  (Burroughs),  DBMS-10, 
DBMS-11, IDMS. 7 agencies (Army, VA, Air Force, HEW/SSA,
OMB, NBS, NSA) have these systems.

(b)  System  2000  -  4  agencies  (Army,  Air  Force,  HEW/NIH, 
Agriculture)   use  this  system. 

(c)  IMS  -    2  agencies  (Air  Force,  DHEW)   use  this  system. 

(d)  Report  Writers  -  2  report  writers  (Mark  IV, 
Easytrieve)  are  used  by  2  agencies  (Agriculture,  NIH). 


(e)  Bibliographic systems - 4 bibliographic systems
(CAIN, BASIS, 2 in-house written) are used among the survey
population.


(f)  Hardware  supporting  only  one  DBMS  system  -  2  DBMS  can 
be  classified  as  the  sole  DBMS  operational  for  the  hardware: 
IMAGE  3000  on  Hewlett  Packard,  and  DMS-II  on  Burroughs. 

(g)  Time-Sharing  Service  -  3  agencies  (Army,  GSA,  Agri- 
culture) use  the  CSC  Infonet  Time-sharing  service  which  has 
System 2000, DML and Aladin. The Pension Benefit Guaranty
Corporation  uses  Compu-Serv  time-sharing  service  which  has 
System  1022. 

(h)  Early  generation  DBMS  systems  -  5  early  generation 
systems  are  used  among  the  survey  population.  These  are 
FFS,  NIPS,  IDS-I,  MARS  and  MARK  III.  Users  were  system  pro- 
grammers and  the  usage  is  batch  for  report  generation  pur- 
poses. 

(i)  Research systems - 1 system, INGRES, which is a
relational DBMS developed by the University of California, Berkeley,
is  used  by  2  agencies  (NBS,  NSA).  Usage  is  experimental 
research  oriented  and  no  production  applications  have  been 
developed.  The  Army  is  using  UNIBASE  written  in  ANS  Cobol 
74  for  research  in  DBMS  portability. 

Usage  and  Users 

(a)  For  Host-language  and  CODASYL-like  systems,  the  usage 
is  predominantly  batch,  transaction-oriented  and  used  mostly 
to produce reports. Users were either subject-matter
specialists or clerks invoking pre-defined transactions, or
system programmers coding transactions.

(b)  For systems such as System 2000, ADABAS, 1022, and GIM-II,
usage is predominantly on-line querying and updating.

(c)  The  majority  of  the  users  of  DBMS  are  reported  to  be 
subject-matter  specialists  with  programmers  doing  the  system 
interface  type  of  task. 

(d)  Agriculture has experience with self-imposed
standards. System 2000 at Agriculture has about 28
applications. Some are small applications with fewer than 100
records. The other DBMS in use at Agriculture are
IMAGE-3000, Infonet DML and in-house written ones. Forestry
uses FS-GIM. This might be due to the agency-imposed
standard DBMS practice of defining all applications using one
DBMS.

(e)  In  two  cases  (Army  and  NSA),  DBMS-11  is  being  used  ex- 
perimentally as  a  back-end  to  a  host  computer  370/158. 


7.3     APPENDIX 3 - DBMS TERMINOLOGY FOR TG-24


ACCESS  -  The  operation  of  seeking,  reading,  or  writing  data 
on  a  storage  unit.     [MART  75] 

ACCESS  CONTROL  -  The  mechanism  in  a  data  base  system  which 
provides  protection  of  the  data  base  from  both  inten- 
tional  destruction  and  improper  disclosure.  [DATE  75] 

ACCESS  METHOD  -  A  technique  for  moving  data  between  a  com- 
puter and  its  peripheral  devices,  e.g.  serial  access, 
random  access,  remote  access,  virtual  sequential  access 
method  (VSAM),  hierarchical  indexed  sequential  access 
method (HISAM). [MART 75]

AUDIT TRAIL - The process of keeping records of both
events and data activities within a system to allow
future examination and verification of what has
transpired. [CODA 76]


AUTHENTICATION  -  The  process  of  supplying  information  known 
only  to  the  person  the  user  has  claimed  to  be.  [DATE 
75] 

AUTHORIZATION  -  The  definition  to  the  system  of  the  opera- 
tions each  user  is  allowed  to  perform.     [DATE  75] 

BACK-UP  -  The  process  of  generating  a  copy  of  a  data  base  at 
some  point  in  time.     [CODA  76] 

CATEGORIES  -  Categories  are  groupings  of  database  management 
systems  based  on  some  technical  criteria. 

DATA ATTRIBUTE - A known characteristic of a data item
(e.g., a numeric field, a date field, etc.).

DATABASE  -  A  collection  of  interrelated  data  items  process- 
able  by  one  or  more  programs. 

DATABASE  ADMINISTRATOR  (DBA)  -  A  person  or  persons  given 
the  responsibility  for  the  definition,  organization, 
protection  and  efficiency  of  the  database  for  an  enter- 
prise. 

DATA  DICTIONARY  (DD)  -  A  software  tool  that  is  used  to  con- 
trol the  totality  of  data  elements  within  an  applica- 
tion. It  is  the  central  repository  of  all  descriptive 
information  about  each  data  element. 

DATA DESCRIPTION LANGUAGE (DDL) - The language used to
describe  the  database,  or  that  part  of  the  database 
known  to  a  program. 

DATA  ELEMENT  -    A  named  collection  of  one  or  more  items. 

DATA  INDEPENDENCE  -  The  property  of  being  able  to  change  the 
overall  logical  or  physical  structure  of  the  data 
without  changing  the  application  program's  view  of  the 
data.     [MART  75] 

DATA ITEM - The smallest unit of named data containing no
substructure; the lowest level of addressable data in
which data value(s) are physically stored.

DATABASE  MANAGEMENT  SYSTEM  -  A  software  system  is  a  data- 
base management  system  if  it  possesses  all  of  the  fol- 
lowing properties: 

It  is  an  integrated  set  of  computer  procedures 

It  facilitates  usage  of  data 

It  facilitates  storage  and  maintenance  of  large 
amounts  of  data 

It provides frequently used shared functions

It potentially serves many functional purposes.

DATA  MANIPULATION  LANGUAGE  (DML)  -  The  language  used  to 
cause  data  to  be  transferred  between  the  application 
program  and  the  database. 

DATA  MODEL  -  See  Data  Structure. 

DATABASE  POPULATION  -  The  process  of  placing  occurrences  of 
data  values  into  a  defined  data  base.     [CODA  76] 

DATA  SECURITY  -  The  protection  of  the  data  in  the  data  base 
against  unauthorized  disclosure,  alteration,  or  des- 
truction.    [DATE  75] 

DATA  STORAGE  DESCRIPTION  LANGUAGE  (DSDL)  -  A  language  to  de- 
fine the  representation  of  a  data  base  in  storage. 
This  term  is  used  in  [CODA  78]. 

DATA  STRUCTURE  -  The  logical  relationships  which  exist 
among  the  units  of  data  in  a  database  and  which  are 
under  control  of  a  database  management  system. 

DATA  VOLATILITY  -  Refers  to  the  rate  of  change  of  the  values 
of  stored  data.  [CODA  76] 

DEVICE  MEDIA  CONTROL  LANGUAGE  (DMCL)  -  A  language  used  to 
map  the  data  onto  physical  storage  media.  It  describes 
the physical location and organization of the data.

DISTRIBUTED DATABASE - A database under the overall control
of  a  central  database  management  system,  but  where  the 
devices  on  which  it  is  stored  are  not  all  attached  to 
the  same  processor. 

END  USER  -  The  end  users  are  persons  who  perform  the  appli- 
cation functions.  End  users  include  parametric  users 
and  generalized  function  users,  but  they  are  not  system 
support  personnel.  [CODA  76] 

EXTRACTION  -  The  process  of  obtaining  data  values  on  each 
data  structure  level  and  identifying  the  disposition  of 
these  values  to  an  output  media.     [CODA  76] 

FILE  -  A  collection  of  all  occurrences  of  a  given  type  of 
logical  record. 

HIERARCHY  -  A  set  of  directed  relationships  between  two  or 
more  units  of  data,  such  that  some  units  are  considered 
owners  while  others  are  members.  This  is  distinguished 
from  a  network  in  that  in  a  hierarchy,  each  member  can 
have  one  and  only  one  owner. 

HOST  LANGUAGE  SYSTEM  -  A  database  management  system  that  is 
built  upon  the  facilities  of  a  programming  language  and 
is  identified  to  the  application  programmer  for  logical 
and  physical  file  manipulations.  The  tools  are  embed- 
ded in  the  host  language  (e.g.,  COBOL,  PL/1)  and  are 
accessed  usually  through  CALL  statements,  but  sometimes 
by  extensions  in  the  language. 

INDEX  -  A  table  used  to  determine  the  location  of  a  record. 
[MART  75] 

INTERROGATION  -  The  function  of  data  selection  and  qualifi- 
cation, extraction,  manipulation,  and  result  presenta- 
tion.    [CODA  76] 

INVERTED FILE - A file structure which permits fast
spontaneous searching for previously unspecified
information. Independent lists or indices of record keys
are maintained, accessible according to the values of
specific fields. [MART 77]

LOGGING  -  The  recording  of  actual  changes  to  a  data  base  as 
updating  occurs.  [CODA  76] 

NETWORK  -  A  set  of  directed  relationships  between  owner  and 
members  such  that  a  single  member  record  may  belong  to 
one  or  more  owner  relationships. 

QUALIFICATION - The specification of selection criteria
during the formulation of a query. [CODA 76]

QUERY - Same as interrogation.

QUERY  LANGUAGE  -  Language  which  permits  spontaneous  queries 
to  be  entered,  or  allows  a  non-programmer  user  to  ex- 
plore the  data  base  and  produce  reports  embodying  its 
data.     [MART  75] 

PARAMETRIC  USERS  -  Parametric  users  deal  with  specific  ele- 
ments of  data  according  to  a  predetermined  procedure 
and  using  a  limited  complement  of  commands.  They  are 
basically  unconcerned  with  data  structure  and  flow  ex- 
cept in  terms  of  the  relation  of  data  items  to  identif- 
iers. [CODA  76] 

PRIVACY  KEY  -  A  facility  specified  as  a  literal,  a  data 
item,  or  a  procedure  used  to  unlock  an  operation  which 
is  locked.     [DATE  75] 

PRIVACY  LOCK  -  A  facility  specified  as  a  literal,  a  data 
item,  or  a  procedure  used  to  prevent  an  operation  from 
proceeding  unless  the  matching  privacy  key  is  present- 
ed. [DATE  75] 

RECORD  (logical)  -  A  collection  of  related  values  treated 
as  a  logical  unit  during  any  operation  of  the  database 
management  system  (e.g.,  during  data  collection,  pro- 
cessing, or  output). 

RECORD  (physical)  -  A  unit  of  data  to  be  placed  on,  or  tak- 
en from,  a  storage  device  in  a  single  operation. 

RECOVERY  -  The  regaining  or  the  bringing  back  to  an  original 
position  or  condition.  [CODA  76] 

RELATIONAL  -  A  way  of  modelling  data  structures  by  rela- 
tions. Relations  are  usually  represented  as  a  collec- 
tion of  tables  where  each  table  contains  the  oc- 
currences of  a  particular  relation.  Each  column  of  the 
relation  corresponds  to  an  attribute  and  each  row  is  an 
instance  of  the  relation. 

SCHEMA  (conceptual  schema)  -  A  complete  description  of  the 
database  in  terms  of  the  characteristics  of  the  data 
and  the  implicit  and  explicit  relationship  between 
data-units. 


SELF-CONTAINED SYSTEM - A database management system, the
capabilities and language of which are intended
primarily for the non-programmer. Such systems are
self-contained in the sense that they usually have no
connection with any procedural language, except that the
system itself may be written in a procedural language, or
it may permit user-written code in a procedural language.


SUBSCHEMA  (or  external  schema)  -  A  description  of  those 
data-units  and  relationships  from  a  database  of  in- 
terest to  a  particular  program. 

UPDATE  -  The  process  of  changing  values  in  all  or  selected 
entries,  groups,  or  data  items  stored  in  a  data  base  or 
adding or deleting data occurrences. [CODA 76]

APPENDIX 4 - DATA DESCRIPTION LANGUAGE


WORKPLAN

Objectives

o Recommend and justify need for Standards in the Data
Description Language area based on Federal government
user needs.

o Develop plan of standards activities to meet such
needs.

Methodology

o Define subcommittee scope and objectives

o Develop definition of DDL

o Define products of study

o Define tasks required

o Develop work plan

o Implement plan

o Analyze findings

o Produce recommendations

Tasks

o Define "Data Description Language"

++Review existing definitions
++Review CODASYL proposal
++Define role of DMCL
++Produce DDL definition

o Review existing DDL implementations for different
data models

++Categorize attribute definitions
++Categorize structure definitions

o Summarize and analyze results of previous tasks

++Review work of other DBMS standards groups

o Develop conclusions and recommendations

Review of Data Description Languages

A limited number of DDL definitions were examined. The
purpose of the review was to develop a working definition
that reflected the scope of the subcommittee's area of
responsibility.

The  definitions  reviewed  were: 

1.  Principles  of  Data  Base  Management  -  J.  Martin 

2. CODASYL - Data Description Language, Journal of
Development, June 1973

3. Data Base System, A Practical Reference - Ian Palmer

4.  TG-24  Data  Base  Management  System  Terminology 

DDL  DEFINITION 

A  Data  Description  Language  (DDL)  is  a  stand-alone  language 
that  defines: 

1.  Attributes  of  data  elements 

2.  Logical   structure  of  the  data  base 

3.  Logical   relationships  among  units  of  data  (records) 

4. Logical methods of access to data

THE DDL DEFINITION DOES NOT INCLUDE DMCL

The Device Media Control Language (DMCL) is used to map the
data  onto  physical  storage  media.  It  describes  the  physical 
location  and  physical  organization  of  the  data.  As  such,  it 
is  the  bridge  between  the  DDL  and  the  host  DBMS  or  computer. 

SEPARATION OF DDL INTO DATA ATTRIBUTES AND DATA STRUCTURES

Attribute Definition:

1. Identifies type of data category such as item,
record, file

2.  Names  items,  records,  files 

3.  Specifies  sequence  of  items  in  records 

4.  Defines  method  of  data     encoding     (numeric,  binary, 
ASCII,  etc.) 

5.  Defines  length  of  data  items 

6.  Identifies  repeating  groups  of  items 

7. Specifies validation or range values for data items

Structure Definition:

o Identifies key data items

o Specifies which records are related

o Specifies how records are related

o Names data relationships

o Specifies sequence of records and files

o Specifies method that will be used to access data
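
The separation above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the class
names (ItemDef, RecordDef, Relationship, Description), their
fields, and the DEPARTMENT/EMPLOYEE example are hypothetical and
are not taken from the CODASYL proposal or from any DBMS reviewed
by the subcommittee.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ItemDef:
    """Attribute definition for one data item (hypothetical layout)."""
    name: str
    encoding: str                 # e.g. "numeric", "binary", "ASCII"
    length: int                   # length of the data item
    occurs: int = 1               # a value above 1 marks a repeating item
    valid_range: Optional[Tuple[int, int]] = None  # optional (low, high)

    def is_valid(self, value: int) -> bool:
        """Apply the range check, if one was declared."""
        if self.valid_range is None:
            return True
        low, high = self.valid_range
        return low <= value <= high


@dataclass
class RecordDef:
    """Names a record; list order gives the sequence of its items."""
    name: str
    items: List[ItemDef]


@dataclass
class Relationship:
    """Structure definition: a named relationship between two records."""
    name: str
    owner: str
    member: str
    key_item: str   # key data item, looked up in the member record
    access: str     # logical access method, e.g. "by key"


@dataclass
class Description:
    records: List[RecordDef] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)

    def check(self) -> List[str]:
        """Return names of relationships that refer to undefined
        records or to a key item missing from the member record."""
        by_name = {r.name: r for r in self.records}
        bad = []
        for rel in self.relationships:
            member = by_name.get(rel.member)
            ok = (
                rel.owner in by_name
                and member is not None
                and any(i.name == rel.key_item for i in member.items)
            )
            if not ok:
                bad.append(rel.name)
        return bad


dept = RecordDef("DEPARTMENT", [ItemDef("DEPT-NO", "numeric", 4)])
emp = RecordDef("EMPLOYEE", [
    ItemDef("EMP-NO", "numeric", 6),
    ItemDef("DEPT-NO", "numeric", 4),
    ItemDef("AGE", "numeric", 3, valid_range=(16, 99)),
])
desc = Description(
    records=[dept, emp],
    relationships=[
        Relationship("STAFF", "DEPARTMENT", "EMPLOYEE", "DEPT-NO", "by key"),
    ],
)
```

In this sketch the attribute side (ItemDef, RecordDef) carries
names, encoding, length, repetition and validation, while the
structure side (Relationship) carries keys, related records and
the access method. Nothing here describes physical placement,
which the definition above leaves to the DMCL.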

EVALUATION OF EXISTING IMPLEMENTATIONS

Network,  hierarchical  and  relational  DBMS  implementa- 
tions were  analyzed  for  adaptability  to  DDL  standardization. 
The  necessity  for  different  methods  of  describing  data  at- 
tributes and  structure  was  of  special  concern. 

The CODASYL specification of COBOL-type data attribute
descriptions was adequate, easy to use, and widely known.
Other methods of defining attributes gained nothing for the
price of their being different.

Specification  of  file  size  or  placement  of  data  into 
physical  storage  units  is  dependent  on  hardware.  Each  DBMS 
implementation  is  designed  to  its  host  computer.  This  area 
was not considered a candidate for standardization, and was
removed from our definition of DDL.

Data structures, and associated access facilities, were
the most divergent areas. Each structure had its own
advantages, depending on the application using the data
base. What is good for retrieval may not be good for
update; what is good in batch may not be good on-line.

No  single  structure  was  best  for  all  applications. 
However, the three structures together seem to meet the
needs of most applications.

7.5 APPENDIX 5 - DATA MANIPULATION LANGUAGE


INTRODUCTION 

What  is  a  data  base  management  system?  What  functions 
does it provide? These questions must be answered before a
standard Data Manipulation Language (DML) model can be
developed. After all the functions provided by the set of
software  identified  as  data  base  management  systems  (DBMS) 
are  qualified,  then  the  question  "what  can  be  standardized?" 
can  be  answered.  The  amount  of  effort  required  to  define 
DBMS  and  survey  the  current  market  of  available  DBMS  will  be 
considerable.  However,  it  is  recommended  that  data  base 
management  be  defined  concisely  and  that  every  function  pro- 
vided by  every  system  that  meets  the  definition  be  identi- 
fied. The  data  management  function  (DML)  subcommittee  of 
Task  Group  24  (DATABASE  SYSTEMS  STANDARDS)  has  chosen  a  sub- 
set of  available  DBMS  to  illustrate  how  the  standards  effort 
could  be  applied  to  data  base  management  language  functions. 

Twelve  data  management  systems  were  arbitrarily  select- 
ed for  functional  analysis.  This  subset  will  illustrate  op- 
tional approaches  that  are  possible  with  a  given  set  of  can- 
didate DBMS.  This  paper  only  considers  the  standardization 
of  host  dependent  data  management  languages  of  the  candidate 
DBMS.  The  optional  approaches  leading  to  a  standard  DML  for 
data  base  management  systems  will  be  treated.  The  recom- 
mended approach  will  be  emphasized.  The  following  sections 
of  this  appendix  will  cover  the  approach  that  was  taken  to 
arrive  at  the  recommendations  presented  in  part  II  of  this 
DML  subsection  of  this  report. 

METHOD  AND  SCOPE 

There  are  many  types  of  data  management  systems.  Most 
of  them  can  be  classified  as  one  of  the  following  types: 


1.  Data  retrieval  systems 

2.  File  management  systems 

3.  Complex  file  systems 

4.  Teleprocessing  monitors 

5. Data base management systems

6. On-line data management systems

7. Special purpose systems


There  are  subtle  and  marked  differences  as  well  as  overlap- 
ping capabilities  among  the  types  of  data  management  sys- 
tems. But,  each  available  data  system  can  be  categorized 
into  one  of  these  types.  TG-24  has  been  chartered  to  inves- 
tigate standardization  of  data  base  management  systems.  The 
DML  subcommittee  is  limited  to  evaluating  the  possible  op- 
tions open  to  standardization  of  the  data  management  func- 
tions that  require  the  support  of  a  host  programming 
language, such as FORTRAN or COBOL. It has been found
that considering only the DML in standardizing DBMS is not
practical. The DML is entwined with the rest of the data
base management parts: the DDL, the DMCL, and the host
computer operating system. So, the first step the DML
subcommittee decided to take was to collect all the data
management systems that fit the type "data base management
systems." This seems simple enough, except that there are no
formal, well-accepted definitions for any of the above types
of data management systems, especially for DBMS. TG-24 had
two problems. First, there were no acceptable definitions.
Second, there was a large number of data management systems
available that needed to be measured against a definition.

The  DML  subcommittee  of  TG-24  felt  it  necessary  to  lim- 
it the  field  in  order  to  develop  a  standard  model  DML  con- 
taining a  representative  set  of  the  data  management  func- 
tions provided  by  the  set  of  existing  DBMS.  In  order  to  ac- 
complish this  subsetting  to  one  level,  a  definition  of  a 
data  base  management  system  in  terms  of  major  functions  pro- 
vided was  proposed.  This  definition  is  not  intended  to  be 
the final acceptable definition of data base management
systems. Its purpose is to limit the available DBMS
candidates for the first phase standardization effort. A
broader definition that will include all data management
systems in the standards effort is recommended after a firm
approach is selected using a smaller set of DBMS. Also, for
an acceptable definition to evolve, a preliminary one must
be provided as a base. The definition for a data base
management system in terms of functions and facilities
provided is as follows:


1. It is an integrated sharable set of computer programs
that perform data access functions.

2. It separates data access (DML) and data definition
(DDL) from application programs in a uniform manner.

3. It facilitates and controls in a uniform manner
storage and retrieval of data in and from a data
base using random access storage devices.

4. It is generalized in the areas of file structure,
the number and types of physical files supported,
and in number and type of applications supported.

5. It supports and maintains the integrity of the data
through validity checking and file backup and recovery
utilities.

Now  that  a  definition  exists,  a  survey  of  the  market  can  be- 
gin, and  all  the  data  management  systems  that  meet  the  de- 
finition can  be  selected  as  candidate  DBMS  for  the  DML  stan- 
dardization evaluation.  It  was  found  that  the  number  of 
data  management  systems  that  need  to  be  surveyed  was  too 
large  for  TG-24  to  handle  in  the  one  year  time  frame  given 
to  our  initial  study  of  DBMS  standardization  feasibility. 
The  DML  subcommittee  chose  to  select  a  small  subset  of  sys- 
tems that  are  commonly  accepted  as  DBMS  to  illustrate  the 
options  open  to  DML  standardization. 

The  DML  subcommittee  chose  twelve  DBMS  for  evaluation. 
The  twelve  systems  are  as  follows: 


1.  COMPUTER  CORPORATION  OF  AMERICA  -  MODEL  204 

2.  DIGITAL  EQUIPMENT  CORPORATION  -  DBMS/10 

3.  CULLINANE  -  IDMS 

4.  HONEYWELL  INFORMATION  SYSTEMS  -  IDS/II 

5.  MRI  -  SYSTEM  2000 

6.  SOFTWARE  HOUSE,   INC.   -  SYSTEM  1022 

7.  UNIVAC  -  DMS/1100 

8.  IBM  -  IMS 

9.  BURROUGHS  -  DMS/II 

10. UNIVERSITY OF CALIFORNIA - INGRES

11. GENERAL MOTORS RESEARCH LABORATORY - ROMS

12. UNIVERSITY OF TORONTO - TORUS/ZETA


Using  the  twelve  DBMS  selected,  a  chart  was  built  showing 
the  major  DML  commands  of  each  DBMS.  See  Chart  I.  Each  of 
the  DBMS  has  other  commands  for  control  and  interface  with 
respective  operating  systems  that  were  omitted  for  this 
illustration.  Examination  of  Chart  I  will  show  that  at 
least  four  possible  categories  of  command/function  sets  can 
be  identified  by  selecting  common  command  subsets.  Also,  if 
the  data  structure  models  were  made  known,  the  categories 
could  be  identified  by  data  structure.  This  is  not  surpris- 
ing because  the  DML  commands  directly  relate  to  the  data 
structure  model  of  a  given  DBMS.  The  four  categories  that 
can  be  identified  are  network,  hierarchical,  relational,  and 
table  structure.  There  may  be  more  possible  categories. 
The  area  of  DBMS  classification  and  definition  is  complex 
and  controversial.  To  resolve  this  area  satisfactorily  may 
require  a  separate  committee  and  a  year  or  two's  time.  A 
network  system  is  an  extension  of  a  hierarchical  system.  It 
accommodates  hierarchy  but  also  allows  complex  networks. 
Most  CODASYL  DBMS  fall  into  this  category.  A  hierarchical 
system  accommodates  data  hierarchy  in  the  data  structure 
model,  and  supplies  commands  to  manipulate  hierarchical 
data. The relational system represents data as relations.
A relation is a set of tuples, each holding one value for
every attribute of the relation. Also, the creation and
destruction of relations is dynamic. The
self-contained  system  is  one  that  does  not  need  another 
language  to  act  as  a  host.  However,  many  self-contained 
systems  offer  host  language  capability.  The  difference  is 
in  the  physical  data  structure;  a  self-contained  system  usu- 
ally has  a  static  physical  data  structure,  such  as  a  table 
structure. Regardless of the logical structure of the data
base, the physical structure is the same in most
self-contained systems.
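
The distinction drawn above between hierarchical and network
systems rests on how many owners a member record may have. A
minimal Python sketch of that test follows. The function names
and the DEPT/EMPLOYEE/SKILL/PROJECT record types are hypothetical,
and the sketch assumes the owner-member links contain no cycles,
which it does not check.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def owners_by_member(links: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each member record type to the set of its owner record types."""
    owners: Dict[str, Set[str]] = defaultdict(set)
    for owner, member in links:
        owners[member].add(owner)
    return dict(owners)


def classify(links: Iterable[Tuple[str, str]]) -> str:
    """Return "hierarchy" when every member has exactly one owner,
    and "network" when some member belongs to more than one owner."""
    owners = owners_by_member(links)
    if all(len(o) == 1 for o in owners.values()):
        return "hierarchy"
    return "network"


# Each pair is (owner, member).
tree = [("DEPT", "EMPLOYEE"), ("EMPLOYEE", "SKILL")]
net = tree + [("PROJECT", "EMPLOYEE")]  # EMPLOYEE now has two owners
```

With these inputs, classify(tree) reports a hierarchy and
classify(net) reports a network. Every hierarchy is also
acceptable to a network system, which is the sense in which the
network model extends the hierarchical one.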

DBMS designs have different purposes; for example, relational
systems  allow  dynamic  changes  to  be  made  to  sets  and  rela- 
tionships to  handle  a  moderate  amount  of  volatile  data.  The 
CODASYL  approach  allows  very  flexible  data  structure  for 
large  amounts  of  data,  but  the  structure  is  very  static  once 
defined.  The  command  sets  that  accompany  the  different 
types  of  data  base  management  systems  are  directly  related 
to  the  data  models  of  the  particular  type.  This  is  the  rea- 
soning behind  categories  of  data  base  management  systems 
that are illustrated in Charts A, B, C, and D.

Both the common and unique commands are illustrated in
Charts  A,  B,  C,  and  D.  The  DBMS  were  grouped  by  evaluating 
common  commands,  unique  commands,  and  data  models.  The 
names  given  to  the  resulting  categories  are  not  important, 
and  this  is  another  area  that  could  be  standardized.  Some 
systems  could  fit  into  more  than  one  category,  such  as  Sys- 
tem 2000.  System  2000  is  a  table  structure  system,  but  it 
is  also  a  hierarchical  system.  It  was  included  in  both 
Charts  B  and  D  to  illustrate  dual  category  possibilities. 
This  also  provides  encouragement  for  the  possibilities  of 
DBMS  evolving  to  provide  similar  command/function  capabili- 
ties. Network  and  hierarchical  systems  are  similar  enough 
to  evolve  to  one  category  in  the  near  future.  There  is 
probably  one  other  category  that  will  result  if  a  more  com- 
plete comparison  and  classification  effort  is  undertaken. 
It  will  be  the  special  purpose  data  base  management  system 
category. In some cases, semi-generalized data base
management systems are developed for special purpose
hardware and/or applications. Due to the special purpose
nature of such a DBMS, it would be very difficult to
standardize. One special purpose system is usually not very
similar to another. So, this category would hold a group of
DBMS that are dissimilar to one another and to the other
categories, and that are not amenable to standardization.
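
The grouping step described above, comparing common and unique
commands within a candidate category, can be sketched with set
operations. The system names SYSTEM-X, SYSTEM-Y and SYSTEM-Z are
hypothetical, and the command names are illustrative only; they
are not taken from Chart I or from Charts A through D.

```python
from typing import Dict, Set


def common_commands(systems: Dict[str, Set[str]]) -> Set[str]:
    """Commands offered by every system in the category."""
    sets = list(systems.values())
    return set.intersection(*sets) if sets else set()


def unique_commands(systems: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Commands offered by exactly one system, keyed by that system."""
    result = {}
    for name, cmds in systems.items():
        # Union of the commands of every other system in the category.
        others = set().union(*(c for n, c in systems.items() if n != name))
        result[name] = cmds - others
    return result


category = {
    "SYSTEM-X": {"FIND", "GET", "STORE", "MODIFY", "ERASE", "CONNECT"},
    "SYSTEM-Y": {"FIND", "GET", "STORE", "MODIFY", "ERASE"},
    "SYSTEM-Z": {"FIND", "GET", "STORE", "MODIFY", "ERASE", "KEEP"},
}
```

A large common subset and small unique sets suggest that the
systems belong in one category. A system whose commands overlap
well with two groups would appear in both, as System 2000 does
in Charts B and D.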

PROCESS LEADING TO DML RECOMMENDATIONS

Now  that  the  DBMS  are  categorized  into  subsets,  some 
options  for  standards  can  be  proposed.  Three  or  four  op- 
tions are  possible  today.  First,  a  standard  DML  interface 
language  for  each  category  could  be  developed.  Second,  an 
interface  language  could  be  developed  to  include  all  identi- 
fied categories;  one  standard  interface  language  with 
preprocessors  to  generate  individual  DML's.  Third,  a  set  of 
standard  DML  commands  could  be  developed  for  each  category 
identified  from  the  analysis  of  all  candidate  DBMS.  Of 
course,  the  option  of  no  recommended  standards  at  this  time 
is a possibility. The possibility of selecting an existing
DBMS as a standard appears unlikely because no single DBMS
will have the superset of commands/functions that would be
needed to provide all the capability now available by use of
a combination of available DBMS.

1. Multiple Interface Option

Most  competitive  DBMS  will  have  the  same  major  capabil- 
ities within  the  next  five  years.  Of  course,  the  syntax  and 
semantics  of  the  interface  languages  (host  language  inter- 
faces) will  not  be  standard  unless  reasonable  standards  are 
proposed  and  accepted.  The  standard  interface  approach  for 
categories  of  DBMS  provides  a  flexible  method  for  gaining 
applications program DBMS standardization. As DBMS begin
to provide the same functions and capabilities, the
interface(s) can be modified to accommodate changes or
additions to the standard DMLs. Eventually, a single
standard DML interface may evolve if the DBMS become
similar enough.
Vendors  of  DBMS  are  already  providing  similar  capabilities 
to  remain  competitive.  The  users  should  begin  to  be  con- 
cerned with  standard  methods  of  identifying  capabilities  be- 
fore the  vendors  implement  new  capabilities. 

2. Single Interface Option

The  option  of  beginning  with  a  single  standard  DML  in- 
terface may  be  possible,  but  it  would  be  very  complex  and 
may  not  be  acceptable  to  the  end  user.  It  would  be  neces- 
sary for  the  single  interface  to  accommodate  the  full  set  of 
DML  commands  of  all  the  candidate  DBMS  selected.  This  would 
result in command translations that would cause no operation
in any DBMS that does not provide the command capability
desired. It is recommended that if the interface option to
standardization  is  selected,  the  first  option  chosen  should 
be  one  standard  interface  for  each  identified  category.  If 
this  approach  is  followed,  then,  as  the  DBMS  categories  pro- 
vide the  same  command/functions,  one  standard  interface  may 
evolve. One advantage of the interface approach is that the
user  can  implement  or  have  the  interface  implemented  without 
cooperation  from  the  vendor  of  any  or  all  the  candidate 
DBMS.  However,  vendor  cooperation  is  desired  if  possible. 
It  is  also  recommended  that  the  feasibility  of  developing  a 
single  DML  standard  for  all  host  programming  languages  for  a 
given  category  of  DBMS  be  evaluated. 
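
The cost of a single interface, measured in translations that
result in no operation, can be illustrated with a short Python
sketch. The system names (NET-1, NET-2, REL-1) and the command
names are hypothetical and serve only to show the arithmetic;
they are not drawn from the twelve systems surveyed.

```python
from typing import Dict, Set


def no_op_counts(systems: Dict[str, Set[str]]) -> Dict[str, int]:
    """For one interface covering the union of all commands, count the
    commands each system cannot carry out (translations to no operation)."""
    full_set = set().union(*systems.values())
    return {name: len(full_set - cmds) for name, cmds in systems.items()}


network = {
    "NET-1": {"FIND", "GET", "STORE", "CONNECT"},
    "NET-2": {"FIND", "GET", "STORE", "CONNECT", "DISCONNECT"},
}
relational = {
    "REL-1": {"RETRIEVE", "APPEND", "DELETE", "REPLACE"},
}

# One interface for all systems versus one interface per category.
single = no_op_counts({**network, **relational})
per_category = {**no_op_counts(network), **no_op_counts(relational)}
```

In this made-up case the single interface leaves NET-1 with five
commands it cannot perform, NET-2 with four and REL-1 with five,
while the per-category interfaces leave one, zero and zero. This
is the reasoning behind recommending one standard interface for
each identified category as the first step.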

3. Standard DML Language Option

A  set  of  DML  commands  could  be  specified  for  each  ca- 
tegory of  DBMS.  Again,  it  is  wise  to  stay  within  the  ca- 
tegories to  avoid  translations  that  would  result  in  no 
operations.  Chart  I  gives  an  indication  of  the  difficulties 
that  would  be  faced  in  developing  a  standard  DML  command  set 
for  all  available  DBMS.  This  approach  would  not  provide  the 
user  with  a  working  standard  as  readily  as  the  interface  ap- 
proach, because  after  the  standard  language  is  specified, 
the  individual  vendors  of  each  DBMS  in  each  category  would 
be  responsible  for  implementing  the  standard  DML,  if  indeed 
they would be agreeable to do so. The standard DML would
not insulate the user from changes in individual DBMS as
well as the interface option would. Changes to the DML of a
given  DBMS  could  be  handled  in  the  standard  interface  in- 
stead of  in  each  application  program.  If  changes  were  made 
to a standard DML system, each application that would be
affected would require change. However, this option would
allow an application program to access any DBMS of a given
category in a computer network using one DML. For a given
category (see Charts A, B, C, and D), the major DML commands
are already similar. The problem with specifying a standard
DML is that each vendor of each candidate DBMS for each
category must implement the standard. This is a big problem
because of cost to the vendor and the fact that one or more
of the vendor's major capabilities may not be standard. This
could make vendors unwilling to agree to standardize their
DML. Nevertheless, it is recommended that a standard DML
set be specified
for each category of DBMS for two reasons. One, by
specifying  the  DML  set,  the  data  management  functions  for 
each  category  are  identified.  Two,  the  DML  sets  for  each 
category  are  basic  to  whatever  approach  is  taken  to  develop 
a  DML  standard. 

4.     No  Standard  DML 

No  standard  is  always  a  possibility,  but  in  the  case  of 
data  base  management  systems  a  lack  of  standards  is  costly 
and  is  going  to  be  more  costly.  Hardware  advances  in  the 
next  decade  will  probably  provide  more  speed  to  conventional 
DBMS  rather  than  new  DBMS  concepts.  Standardization  of 
current  conventional  DBMS  functions  will  not  only  help  the 
users  within  the  next  decade,  but  will  help  the  designers  of 
future  data  base  management  systems  by  emphasizing  the  major 
DML  functions  through  standardization.  Then  only  new  system 
capabilities  will  need  standards  attention. 

BENEFITS  EXPECTED 

It  would  be  very  difficult  to  place  a  dollar  value  on 
DBMS  standardization  versus  no  standardization.  It  is  a 
field  of  computer  science  that  is  still  in  infant  stages. 
Most  installations  have  only  recently  implemented  their 
first  production  system  using  a  DBMS.  Consequently,  there 
is little data available for measuring conversion or
training costs with regard to standard or non-standard DBMS. It
can  only  be  speculated  to  be  expensive  to  convert  applica- 
tions from  one  DBMS  to  another.  Also,  it  can  be  speculated 
that  some  conversions  may  not  be  necessary  if  the  same  ap- 
plication program  could  access  data  of  two  or  more  DBMS 
without  modifications.  This  implies  benefits  for  standard 
DBMS  that  cannot  be  proven  as  yet. 

An  important  benefit  to  be  gained  from  the  standardiza- 
tion of  DBMS  will  be  related  to  the  heterogeneous  computer 
network  environment  and  distributed  data  bases.  Currently, 
the hardware and software exist to provide the
communications capability for a heterogeneous computer
network. However, data bases under the control of different
DBMS on the various computer nodes of the network cannot be
readily accessed in a uniform or standard manner. A standard
DML interface or a standard DML would reduce the programming
and training effort necessary to access data controlled by
different DBMS. It would be possible for a single application
program  to  access  similar  data  under  the  control  of  dif- 
ferent DBMS  without  the  need  to  change  or  rewrite  the  pro- 
gram. Of  course,  other  system  software,  such  as  operating 
system  interfaces,  user  protocols,  etc.  would  also  require 
standardization  to  allow  the  DML  standard  to  be  fully  and 
effectively used. Changes to the system software could be
handled in the interfaces, freeing applications programmers
of that concern. Maintenance could, a large percentage of
the time, be handled centrally by updating the interfaces
instead of updating each individual application
program.  Life  cycle  support  of  applications  systems  could 
be  greatly  reduced. 

DEVELOPMENT  AND   IMPLEMENTATION  FEASIBILITY 


The standard interface approach is clearly feasible to develop.
Its chief advantage is that a single vendor or large user could
design and develop the interfaces. Most other approaches rely on
the individual vendor of each DBMS to implement a proposed
standard. That is acceptable as far as it goes, but an automatic
method of sharing data between heterogeneous systems is not
obtainable through a multi-vendor standards effort. For instance,
the CODASYL standard provides a standard specification for a data
base management system; however, two different vendor
implementations are not compatible for data sharing purposes. Only
the languages, the DDL and the DML, are compatible. An application
program compiled with the DML of one CODASYL system cannot access
the data belonging to another CODASYL system without being
rewritten and recompiled. If the application program were to use a
standard DDL and DML interface, the interface could handle the
translation to a different data management system. As previously
stated, an interface could be developed for each category of DBMS
identified. This is recommended as a first phase DBMS
standardization effort.

The proposed interface would consist of a precompiler stage plus
an execution time management routine. The precompiler would
generate a parametric call to a generic data management routine.
At execution time, depending upon which DBMS is to be invoked, a
specific linkage could be made. The DDL methodology would also
require a standard interface to handle dynamic linkage to specific
data structures. This is anticipated to be a later phase of the
interface standardization effort. The first level will be to
standardize the DDL and DML interfaces for each identified
category of DBMS. An application program written using one of the
standard sets of interfaces would be precompiled, generating calls
to a specific DBMS. Once an application program is compiled, it
would be locked into a specific data base management system.
However, it would only need to be recompiled using a different
precompiler to access a different DBMS of the same category. This
approach would provide the applications user with a standard DML
set for each category of DBMS in the near future, and could very
possibly evolve to one set if the DBMS become similar enough in
the short range.
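
The two stages described above can be illustrated with a minimal
sketch in Python. It is not part of the Task Group proposal. The
statement form "GET record key", the two vendor names, and the
native command strings they produce are all hypothetical, chosen
only to show a precompiler producing a parametric call and a
generic routine selecting the DBMS-specific linkage at execution
time.

```python
from typing import Callable, Dict, Tuple

# A parametric call: (command, record name, key value).
Call = Tuple[str, str, str]


def vendor_x_adapter(command: str, record: str, key: str) -> str:
    """Hypothetical linkage to DBMS 'X'; the native verbs are invented."""
    verbs = {"GET": "XFIND", "PUT": "XSTORE"}
    return f"{verbs[command]} {record} KEY={key}"


def vendor_y_adapter(command: str, record: str, key: str) -> str:
    """Hypothetical linkage to DBMS 'Y' of the same category."""
    verbs = {"GET": "Y-READ", "PUT": "Y-WRITE"}
    return f"{verbs[command]}({record},{key})"


# Table of specific linkages available to the generic routine.
ADAPTERS: Dict[str, Callable[[str, str, str], str]] = {
    "VENDOR_X": vendor_x_adapter,
    "VENDOR_Y": vendor_y_adapter,
}


def precompile(statement: str) -> Call:
    """Precompiler stage: turn a standard DML statement such as
    'GET EMPLOYEE 1042' into the parameters of a generic call."""
    command, record, key = statement.split()
    return (command.upper(), record.upper(), key)


def data_management(target_dbms: str, call: Call) -> str:
    """Execution time stage: link to the adapter for the DBMS invoked."""
    adapter = ADAPTERS[target_dbms]
    return adapter(*call)


# The same precompiled call is routed to either DBMS without change.
call = precompile("get employee 1042")
native_x = data_management("VENDOR_X", call)
native_y = data_management("VENDOR_Y", call)
```

In the first phase described in the text, the choice of adapter
would be fixed when the program is precompiled; the sketch shows
the later, execution time form of linkage.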

MAINTENANCE 

The standard interface approach insulates the user from vendor
changes to the given category of DBMS with which he is concerned.
If a data management command/function does not exist in a
particular DBMS, the interface could emulate the function or
notify the user that the feature is not currently available
through the DBMS he is trying to access. If the function later
becomes available in the DBMS, the emulation or notification
could be dropped; if the function was never available in the
interface, it could be added. The user would not be required to
modify existing DBMS host applications programs every time a
change occurred to a given DBMS.
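
The emulate-or-notify behavior described above might look like the
following sketch. The class, the function names READ_ALL, COUNT,
and SORT, and the sample records are assumptions made for
illustration; no existing interface product is being described.

```python
from typing import Any, Callable, Dict


class FeatureNotAvailable(Exception):
    """Raised when the target DBMS neither offers nor can emulate a function."""


class StandardInterface:
    """Hypothetical standard interface in front of one DBMS."""

    def __init__(self, dbms_name: str,
                 native: Dict[str, Callable[..., Any]],
                 emulated: Dict[str, Callable[..., Any]]) -> None:
        self.dbms_name = dbms_name
        self.native = native        # functions the DBMS itself provides
        self.emulated = emulated    # functions built from native ones

    def execute(self, function: str, *args: Any) -> Any:
        if function in self.native:
            return self.native[function](*args)
        if function in self.emulated:
            # Emulated functions receive the interface so they can
            # call the native functions they are built from.
            return self.emulated[function](self, *args)
        raise FeatureNotAvailable(
            f"{function} is not currently available through {self.dbms_name}")


records = [{"id": 1}, {"id": 2}, {"id": 3}]
iface = StandardInterface(
    "DBMS_X",
    native={"READ_ALL": lambda: list(records)},
    emulated={"COUNT": lambda i: len(i.execute("READ_ALL"))},
)

count = iface.execute("COUNT")      # emulated through READ_ALL
try:
    iface.execute("SORT")           # neither native nor emulated
    sort_message = ""
except FeatureNotAvailable as exc:
    sort_message = str(exc)
```

If the vendor later adds a native COUNT, only the two tables held
by the interface change; the application's call to
execute("COUNT") stays as written.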

The  maintenance  functions  would  be  transferred  from  the 
individual  applications  programmers  to  a  central  staff  of 
data  base  system  administrators  (DBSA).  The  time  and  per- 
sonnel costs  could  possibly  be  reduced  depending  upon  the 
number  of  application  areas  supported  by  a  given  data  base 
management  system.  With  a  large  number  of  unique  applica- 
tions supported  by  a  given  DBMS,  a  saving  could  be  realized 
by  centralizing  the  support  rather  than  dispersing  the 
maintenance  activities  over  all  the  application  areas  to  be 
supported.  In  the  case  where  only  one  special  purpose  ap- 
plication is  to  be  supported,  the  savings  would  not  occur. 

CONCLUSIONS

Standardization of the data management function is definitely
recommended. Many approaches are possible, but the interface is
probably the most viable one. Its implementation does not rely on
the efforts of each candidate DBMS vendor, and it guarantees not
only syntactic compatibility but also compile and execution time
compatibility for at least a given category of DBMS. It is not
very likely that individual vendors of various DBMS will ever
offer systems that provide the same level of compatibility that a
standard set of interfaces could.

Charts A, B, C, and D illustrate the elementary command/function
sets that would form the basis for the standard interface
languages for the four categories identified in this paper. More
categories may be identified when a complete survey is done using
a broader definition of a DBMS. There will definitely be more
commands and functions identified for inclusion in the standard
interface languages for each category of DBMS that is identified.
This approach is not limited to commercially available DBMS. An
in-house system could be included by building an interface to it.
Another sometimes important advantage of the interface approach is
that the existing data bases and applications can remain unchanged
while new programs and data bases are designed and written.

The  goals  of  this  proposed  standardization  effort 
should  be  to  provide  the  user  with  a  standard  and  uniform 
method  to  build  and  access  data  bases  using  any  one  of  the 
data  base  management  systems  from  a  given  category.  It 
would  be  ideal  to  be  able  to  provide  only  one  access  or  in- 
terface language  for  all  data  base  management  systems.  How- 
ever, that  is  not  practical  today.  The  team  or  committee 
that  does  the  survey  of  data  base  management  systems  would 
be  wise  to  investigate  the  facilities  and  functions  needed 
by  the  generalized  programming  user  and  specify  the  charac- 
teristics of  such  an  interface  and  measure  the  existing  sys- 
tems against  it.  Also,  in  an  effort  to  develop  one  single 
interface  language  for  DBMS,  it  would  be  wise  to  investigate 
the  correspondence  between  the  relational  calculus  function 
of  relational  DBMS  and  the  steps  of  the  conventional  inquiry 
process (i.e., self-contained, CODASYL, hierarchical).


7.6 APPENDIX 6 - DATA DICTIONARY/DIRECTORY


++Definition.

A data element dictionary/directory (referred to below as a data
dictionary, or DD) is a software tool that is used to control the
totality of data elements within an application. It is viewed as
the central repository of all descriptive information about each
data element contained within an application data base. It lists,
describes, and locates each data element in a data base. The
dictionary does not manage the actual content of the data, but it
does manage the descriptive characteristics of data.

The  dictionary  portion  is  a  glossary  of  terms  of  data 
elements  representing  their  characteristics  and  logical  re- 
lationships with  data  base  components  and  application  usage. 
The  dictionary  describes  what  data  are  contained  in  the 
organization's  data  base. 

The directory portion contains object data definitions as used
by the computer, plus physical storage locations and access
strategies. The directory locates where data are stored.

++Usage.

The  DD  serves  the  database  administrator,  systems 
analyst,  software  designer,  and  programmer  by  providing  a 
central  repository  for  information  about  data  resources.  It 
aids  people  in  planning,  controlling,  and  evaluating  the 
collection,  storage,  and  use  of  the  data  resources. 

As  a  documentation  tool,  it  captures  descriptive  infor- 
mation and  aids  in  establishing  standards  for  data  naming, 
usage,  and  coding  conventions.  It  identifies  users  or  pro- 
grams which  may  be  affected  by  any  changes,  additions,  or 
deletions  to  the  data. 
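
The impact analysis mentioned above, finding the users and
programs affected by a change, can be sketched as a lookup over
the dictionary's cross-references. The element names, program
names, and user groups below are invented sample data, and the
function name is an assumption.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical cross-reference entries: element -> who uses it.
USAGE: Dict[str, Dict[str, Set[str]]] = {
    "EMPLOYEE-ID": {"programs": {"PAYROLL01", "HRRPT07"},
                    "users": {"payroll office"}},
    "SALARY": {"programs": {"PAYROLL01"},
               "users": {"payroll office", "budget office"}},
    "ZIP-CODE": {"programs": {"MAILLIST"},
                 "users": {"mail room"}},
}


def affected_by_change(elements: Iterable[str]) -> Dict[str, List[str]]:
    """Collect every program and user that references any changed element."""
    programs: Set[str] = set()
    users: Set[str] = set()
    for name in elements:
        entry = USAGE.get(name, {})
        programs |= entry.get("programs", set())
        users |= entry.get("users", set())
    # Sorted lists give a stable report regardless of set ordering.
    return {"programs": sorted(programs), "users": sorted(users)}


impact = affected_by_change(["SALARY", "EMPLOYEE-ID"])
```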

If the DD is interfaced into the structure of a DBMS, then it may
be used for generating the data description language of the DBMS,
maintaining data descriptions, and compiling or interpreting
program references which use these descriptions.


++Functions of Data Dictionary.

The primary function of a DD is to provide a method of
centralized control over data elements. The following are some
functions of existing DD. No one system performs them all.


Data attribute - Name, Length, Type

Data relationship - The structural properties among data

Synonyms - Allow alias capability

Textual description

Access control specification

Edit and validation check rules - e.g., null and default values,
ranges of value permitted, etc.

Output decoding specification - coded values to be translated so
that they are human-readable.

Key/Non-key - Indicate whether the data element is meant to be
searchable or not.

Physical storage specification - logical and physical address,
internal representation, compaction, etc.

Usage information - cross-references to users, application
programs, and output reports.

Frequency of access - usually used for statistical monitoring
purposes.
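
Several of the functions listed above can be pictured as fields
and methods of a single dictionary entry. The following Python
sketch is illustrative only. The field names, the sample elements
PAY-GRADE and STATUS, and their rules are assumptions and do not
describe any commercial DD.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class DataElement:
    """One dictionary entry; field names are illustrative only."""
    name: str                                   # data attribute: name
    length: int                                 # data attribute: length
    data_type: str                              # data attribute: type
    description: str = ""                       # textual description
    synonyms: List[str] = field(default_factory=list)
    is_key: bool = False                        # key/non-key
    default: Optional[str] = None               # edit rule: default value
    valid_range: Optional[Tuple[int, int]] = None
    decode_table: Dict[str, str] = field(default_factory=dict)
    used_by: List[str] = field(default_factory=list)   # usage information
    access_count: int = 0                       # frequency of access

    def validate(self, value: Optional[str]) -> str:
        """Apply the edit rules: default for null, then length and range."""
        if value is None:
            if self.default is None:
                raise ValueError(f"{self.name}: null value not permitted")
            value = self.default
        if len(value) > self.length:
            raise ValueError(f"{self.name}: value longer than {self.length}")
        if self.valid_range is not None:
            low, high = self.valid_range
            if not low <= int(value) <= high:
                raise ValueError(f"{self.name}: {value} outside {low}-{high}")
        return value

    def decode(self, value: str) -> str:
        """Output decoding: translate a coded value for display."""
        return self.decode_table.get(value, value)


class DataDictionary:
    def __init__(self) -> None:
        self._elements: Dict[str, DataElement] = {}
        self._aliases: Dict[str, str] = {}

    def add(self, element: DataElement) -> None:
        self._elements[element.name] = element
        for alias in element.synonyms:
            self._aliases[alias] = element.name

    def lookup(self, name: str) -> DataElement:
        """Resolve a name or synonym and record the access."""
        element = self._elements[self._aliases.get(name, name)]
        element.access_count += 1
        return element


dd = DataDictionary()
dd.add(DataElement(name="PAY-GRADE", length=2, data_type="numeric",
                   synonyms=["GRADE"], is_key=True, default="01",
                   valid_range=(1, 18), used_by=["PAYROLL01"]))
dd.add(DataElement(name="STATUS", length=1, data_type="character",
                   decode_table={"A": "Active", "I": "Inactive"}))

grade = dd.lookup("GRADE")      # found through its synonym
status = dd.lookup("STATUS")
```

Access control and physical storage specifications are omitted
from the sketch to keep it short; they would be further fields of
the same entry.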


++Types and Commercial DD's.

There are at present at least four broad groups of data
dictionary systems on the market that can be identified:

* Type A - The free-standing package which could be used in a
non-DBMS environment.

Examples of commercial DD are:

DICTIONARY        SUPPLIER

DATA CATALOGUE    Synergetics Corp.
DATAMANAGER       Management Systems & Programming, Ltd.
LEXICON           Arthur Andersen & Co.


* Type B - The free-standing package that optionally provides
interfaces to one or more DBMS.

Examples of commercial DD are:

DICTIONARY        SUPPLIER                                   DBMS INTERFACE

DATA CATALOGUE    Synergetics Corp.                          IMS, TOTAL
DATAMANAGER       Management Systems and Programming, Ltd.   ADABAS, IDMS, IMS, TOTAL
LEXICON           Arthur Andersen & Co.                      IDMS, IMS, TOTAL


* Type C - The single data dictionary package designed to
co-exist with a particular DBMS. This type of data dictionary is
solely dependent on the DBMS and is usually developed and
marketed by the same vendor.

Examples of commercial DD packages are:

DICTIONARY            SUPPLIER               DBMS REQUIRED

Control 2000          MRI Systems            System 2000
DB/DC Dictionary      IBM                    IMS or DL/1
UCC TEN               University Computing   IMS
Data Control System   Haverly Systems        DMS-1100
Data Dictionary       Cincom Systems         TOTAL
IDD                   Cullinane              IDMS


* Type D - The data dictionary that is embedded within a DBMS.
The data dictionary function usually is part of the data
definition function, and the meta-data is stored as part of the
database for the DBMS.

Examples of commercial DD are:

DICTIONARY          SUPPLIER   DBMS REQUIRED

GIM II Dictionary   TRW        GIM II


++Information in a Data Dictionary System.

A data dictionary captures information about data elements. Some
of this information seems to be provided by all data dictionary
packages and is deemed essential, while other information, though
desirable, may be omitted at the discretion of the implementor.

To concentrate on the information requirements of those involved
in data and database administration, several types of information
usage were identified. Certain information is captured to be used
by people for documentation and control purposes, while other
information is necessary for the generation of data definition
language, data manipulation language, and application program
processors. For illustrative purposes the following table shows
the information captured in a data dictionary and its uses by
various users.

[Table: Information in DD (rows) against its uses (columns).
Columns: People, DDL, DML, Application Processor.
Rows: Name, Type, Length; Structure Relationship; Synonyms;
Textual Desc.; Access Control; Edit & Validation; Key/Non-Key;
Usage Info.; Freq. of Access; Physical Storage Specification.]


7.7 APPENDIX 7 - END-USER/QUERY LANGUAGE


The  following  subjects  were  considered    when  preparing 
the  End  User  Facility  recommendations. 


++List of End User Facilities.

1.  Report  Specification  Language 

2.  Enquiry  Specification  Language 

3.  Update  Specification  Language 

4.  Parametric  Interface 


++List of Classes of Users.

1. System Analysts/DBA's Staff

2. Application Programmers

3. On-line Job-trained User

4.  Researcher 

5.  Casual  User 


++Measures  of  Query  Languages. 

1.  Quantitative  Measures 

a. Level

b. Completeness

2. Qualitative Measures

a. Mathematical Sophistication

b. Learnability

c. Procedurality


++List of Variable Features in End User Facilities.

A. Procedurality

1. Procedural

a. logic

b. navigation

2. Non-procedural

a. form fill

b. question/answer

c. menu

B. Report formatting

a. flexible

b. pre-formatted

c. automated display of draft format

d. construction assistance

C. Request Analysis

a. monolithic

b. incremental

D. Level

a. % code reduction vs. COBOL (keystrokes)

b. % coding time reduction vs. COBOL

E. Default treatment

F. Ratio of simplicity to functionality

G. Retrieval criteria permitted

a. mathematical operators

b. logical operators

c. string operators (begins, contains)

d. set functions (count, sum, mean, standard deviation,
maximum, minimum)

H. Cascading retrieval

I. Extensibility

J. Processing of conditionals

K. Update capability

a. within queries

b. similar to query syntax

c. special syntax

d. transaction oriented

e. form-fill

L. Navigation

a. DBA intervention required to change access paths

b. fixed path chosen

c. multi-path ambiguity negotiated

M. Execution efficiency

N. Aggregation operations

a. sort flexibility

b. sort efficiency

c. data clumping capability

O. Reusability of user constructs

a. simplicity of retention

b. simplicity of temporary modification

c. simplicity of permanent modification

P. Data type support

a. different types handled

b. conversion support

c. ease of addition of special types

Q. Fog index

a. documentation

b. syntax

R. Benchmarking